High-risk AI systems built on microservice architectures present a version control problem that monolithic systems do not share. Each service is independently deployable, independently versioned, and may be maintained by a different team. A change to any single service can alter the system's overall behaviour in ways that are difficult to predict from the change in isolation. The EU AI Act requires organisations to maintain accurate documentation of the system's architecture and behaviour, which means every service interaction must be tracked and assessed for regulatory impact.
This difficulty arises because microservices communicate across network boundaries through defined interfaces. When one service changes its output format or computational behaviour, the effects propagate through the dependency chain to every downstream consumer. A modification to the data ingestion service that alters how missing values are handled will change the feature vectors produced by the feature engineering service, which will change the model's inference behaviour, which will change the outputs presented to human operators. Without a systematic approach to tracking these cascading effects, the organisation cannot determine whether a change to one service constitutes a substantial modification to the system as a whole. Version Control and Change Management for High-Risk AI Systems covers the broader version control framework within which microservice-specific practices sit.
The organisation must maintain a current dependency map showing how each service communicates with every other service. This map documents the data contracts between services, including schemas, formats, and expected value ranges. It also captures the sequence in which services process data for a given inference request and the failure modes that propagate across service boundaries. This dependency map is not an architectural convenience; it is a compliance artefact. Without it, the organisation cannot assess whether a change to one service constitutes a substantial modification to the system as a whole.
There are two approaches to dependency mapping, and the Technical SME uses both. Declared dependencies are captured in a service catalogue; Backstage is the most widely adopted open-source option. Each service has a catalogue entry listing its upstream dependencies (data sources, APIs, and infrastructure services) and its downstream consumers (applications, deployers, and reporting systems). The catalogue entries are version-controlled as YAML files in a Git repository, making changes to the dependency graph auditable.
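A catalogue entry of the kind described above might look like the following sketch of a Backstage `catalog-info.yaml`. The service names, owner, and API identifier are hypothetical; only the overall descriptor shape follows the Backstage format.

```yaml
# Hypothetical Backstage catalogue entry for a feature engineering service.
# dependsOn lists upstream dependencies; providesApis identifies what
# downstream consumers call. The file lives in the service's Git repository.
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: feature-engineering
  description: Computes feature vectors for the credit-scoring model
spec:
  type: service
  lifecycle: production
  owner: ml-platform-team
  dependsOn:
    - component:data-ingestion
    - resource:feature-store
  providesApis:
    - feature-vector-api
```

Because the entry is version-controlled, any change to the declared dependency graph appears in the Git history and is therefore auditable.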
Observed dependencies are discovered through distributed tracing. Jaeger, Zipkin, and cloud-native tracing services such as X-Ray, Application Insights, and Cloud Trace instrument the runtime environment and capture the actual call graph for every request. This reveals dependencies that the declared documentation may have missed: a logging service that the application calls but that nobody documented, a DNS resolution step that introduces a dependency on an external resolver, or a health check endpoint that a load balancer calls. The discrepancy between declared and observed dependencies is itself a finding that warrants investigation.
Each dependency in the map should be classified along three dimensions to support risk assessment and regulatory documentation. The first dimension is criticality: whether the system would fail if this dependency became unavailable. The second is data sensitivity: whether personal data flows to or from this dependency. The third is change risk: how frequently this dependency changes and what the notification mechanism is.
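The three-dimensional classification can be represented as structured data so that it is queryable rather than buried in prose. The sketch below is illustrative; the dependency names and notification mechanisms are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class Criticality(Enum):
    CRITICAL = "critical"        # system fails if this dependency is unavailable
    IMPORTANT = "important"      # degraded operation without it
    CONVENIENCE = "convenience"  # no operational impact

@dataclass
class Dependency:
    name: str
    criticality: Criticality
    personal_data: bool          # data sensitivity: does personal data flow here?
    change_notification: str     # how changes to this dependency are signalled

deps = [
    Dependency("data-ingestion", Criticality.CRITICAL, True, "contract test"),
    Dependency("audit-logger", Criticality.IMPORTANT, True, "provider changelog"),
    Dependency("dns-resolver", Criticality.IMPORTANT, False, "none"),
]

# Dependencies warranting the closest review: critical AND carrying personal data.
high_risk = [d.name for d in deps
             if d.criticality is Criticality.CRITICAL and d.personal_data]
```

Filtering on these dimensions makes it straightforward to produce the subset of dependencies that a risk assessment or regulatory query needs.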
Architecture diagrams generated from the dependency data, using tools such as Mermaid or Graphviz and stored as code in the repository, serve as Module 3 evidence for the AISDP. The Technical SME regenerates the diagrams on a quarterly cadence aligned with the risk review cycle and compares each version against the previous one to detect undocumented changes. Regulatory Documentation Maintenance provides guidance on maintaining this documentation over time.
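Generating the diagram from the dependency data, rather than drawing it by hand, is what makes quarterly regeneration and diffing cheap. A minimal sketch, with hypothetical service names:

```python
# Sketch: render a Mermaid flowchart from a declared dependency map so the
# diagram source can be regenerated each quarter and diffed in Git.
dependencies = {
    "data-ingestion": ["feature-engineering"],
    "feature-engineering": ["model-serving"],
    "model-serving": ["operator-ui"],
}

def to_mermaid(dep_map):
    lines = ["flowchart LR"]
    for source, consumers in sorted(dep_map.items()):
        for consumer in consumers:
            lines.append(f"    {source} --> {consumer}")
    return "\n".join(lines)

print(to_mermaid(dependencies))
```

Because the output is deterministic (the map is sorted before rendering), any diff between two quarterly runs reflects a real change in the dependency graph, not an artefact of ordering.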
Where tooling is not available, a service dependency map can be maintained as a manually drawn architecture diagram using Mermaid or draw.io, with an accompanying register in spreadsheet form. The register lists each dependency, its criticality classification, data sensitivity, and change notification mechanism. The spreadsheet should include columns for dependency name, type (API, data, or infrastructure), criticality (critical, important, or convenience), data sensitivity (whether personal data flows through it), change notification mechanism, and last verified date. Quarterly review should compare the documented dependencies against the actual system through manual inspection of configuration files, network connections, and API integrations. The manual approach captures only the dependencies the team is aware of; it cannot discover undocumented dependencies automatically. Distributed tracing tools such as Jaeger and Zipkin, both open-source, can supplement the manual approach at no licensing cost.
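The "last verified date" column makes the quarterly review enforceable: entries older than the review window can be flagged automatically even when the register itself is maintained by hand. A sketch, with hypothetical entries and a 90-day window standing in for the quarterly cadence:

```python
from datetime import date, timedelta

# Register rows mirror the spreadsheet columns described above (abbreviated).
register = [
    {"name": "data-ingestion", "type": "API", "last_verified": date(2025, 1, 10)},
    {"name": "dns-resolver", "type": "infrastructure", "last_verified": date(2024, 6, 1)},
]

def stale_entries(rows, today, max_age_days=90):
    """Return names of dependencies not verified within the review window."""
    cutoff = today - timedelta(days=max_age_days)
    return [r["name"] for r in rows if r["last_verified"] < cutoff]

print(stale_entries(register, today=date(2025, 3, 1)))
```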
Before any service is updated, the engineering team must perform a change impact analysis that traces the change's effects through the dependency map. The Technical SME examines each link in the propagation chain, assessing how a modification in one service alters the inputs and outputs of each downstream service. The impact analysis should reference the specific AISDP modules affected and assess whether the combined effect crosses the substantial modification threshold.
The analysis follows the data flow: a change to the data ingestion service alters how missing values are handled, which changes the feature vectors produced by the feature engineering service, which changes the model's inference behaviour, which changes the outputs presented to human operators. Each link in this chain represents a potential point at which the system's compliance status may change. The Technical SME documents the full chain, not merely the first-order effect, because the EU AI Act's substantial modification assessment considers the system's overall behaviour rather than individual component changes.
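Enumerating the full chain, not just the first-order effect, is a graph traversal over the dependency map. A minimal sketch using a breadth-first walk over hypothetical services:

```python
from collections import deque

# Declared consumer map: each service maps to its direct downstream consumers.
consumers = {
    "data-ingestion": ["feature-engineering"],
    "feature-engineering": ["model-serving"],
    "model-serving": ["operator-ui"],
}

def impact_chain(changed_service, consumer_map):
    """Return every downstream service a change could reach, breadth-first."""
    seen, order, queue = set(), [], deque([changed_service])
    while queue:
        svc = queue.popleft()
        for downstream in consumer_map.get(svc, []):
            if downstream not in seen:
                seen.add(downstream)
                order.append(downstream)
                queue.append(downstream)
    return order

# Every service in the returned list must be assessed in the impact analysis.
print(impact_chain("data-ingestion", consumers))
```

The `seen` set matters in real dependency graphs, which often contain diamonds (two paths converging on the same consumer); without it the same downstream service would be assessed twice.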
While each service carries its own version number, the system as a whole must also carry a composite version identifier that captures the specific combination of service versions currently deployed. This composite version is the version recorded in the AISDP, in the EU database registration, and in the Declaration of Conformity.
A deployment event that changes one microservice changes the composite version, even if the other services remain unchanged. This means that the composite version provides a precise snapshot of the system's state at any point in time, enabling the organisation to reconstruct exactly which combination of service versions was active when a particular decision was made. Audit Trail Design for AI Systems covers the broader audit trail design within which composite versioning operates.
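One way to derive such an identifier, sketched below, is to hash the sorted set of service/version pairs; any single service change then produces a different composite version, and the same deployment always produces the same identifier. The naming scheme is an assumption, not a prescribed format.

```python
import hashlib

# Hypothetical deployed service versions.
deployed = {
    "data-ingestion": "2.4.1",
    "feature-engineering": "1.9.0",
    "model-serving": "3.0.2",
}

def composite_version(services):
    """Deterministic system-level identifier for a set of service versions."""
    # Sorting makes the identifier independent of enumeration order.
    canonical = ";".join(f"{name}={ver}" for name, ver in sorted(services.items()))
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:12]
    return f"sys-{digest}"

v1 = composite_version(deployed)
v2 = composite_version({**deployed, "model-serving": "3.0.3"})
# Changing one service's version changes the composite version.
assert v1 != v2
```

Recording the full service/version mapping alongside the hash keeps the identifier reversible: given a composite version in the AISDP, the organisation can look up exactly which service versions it denotes.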
Contract testing addresses a failure mode that integration testing misses: the silent breaking change. When a data provider modifies their API response format, when a feature computation service changes its rounding behaviour, or when a model serving endpoint starts returning confidence scores on a different scale, the dependent system may continue to operate without errors yet produce incorrect results. Contract testing detects these breaks before they reach production.
Consumer-driven contract testing, using tools such as Pact, works by having each consumer of a service define a contract: the consumer specifies the request it will send and the response fields, types, and value ranges it expects. The contracts are stored in a broker and verified against the provider on every provider build. If the provider makes a change that violates a consumer's contract, the provider's build fails before the change is deployed. This prevents the provider from accidentally breaking the AI system's assumptions.
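The mechanism can be illustrated without Pact itself: each consumer publishes a contract describing the response fields and types it relies on, and the provider's build verifies every contract against its current output. This is a conceptual sketch with hypothetical consumers and fields, not the Pact API.

```python
# Contracts as each consumer would publish them: expected field -> expected type.
contracts = {
    "operator-ui": {"decision": str, "confidence": float},
    "audit-logger": {"decision": str, "model_version": str},
}

def provider_response():
    """Stand-in for the provider's current response (hypothetical)."""
    return {"decision": "refer", "confidence": 0.87, "model_version": "3.0.2"}

def verify(consumer, expected, response):
    """Check one consumer's contract against a provider response."""
    failures = []
    for field, ftype in expected.items():
        if field not in response:
            failures.append(f"{consumer}: missing field '{field}'")
        elif not isinstance(response[field], ftype):
            failures.append(f"{consumer}: '{field}' is not {ftype.__name__}")
    return failures

all_failures = [f for c, exp in contracts.items()
                for f in verify(c, exp, provider_response())]
# A non-empty failure list fails the provider's build before deployment.
```

Note that the provider is tested against every consumer's contract, not its own documentation, which is what makes the approach consumer-driven.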
Statistical contract testing, using tools such as Great Expectations applied to data interfaces, extends the concept to data quality. A data consumer defines statistical expectations for each data source: for example, that a particular column has no null values, is non-negative, and has a mean within a defined range of the historical mean. These expectations run on every data delivery. Statistical contracts are particularly important for ML systems because the model's performance depends on the data distribution remaining within the range the model was trained on. A delivery that satisfies the schema contract (correct field names and types) but violates the statistical contract (a shifted distribution) is arguably more dangerous than a delivery that fails the schema check, because it will be silently accepted unless statistical contracts are in place.
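The same expectations named above (no nulls, non-negative, mean near the historical mean) can be sketched in plain Python; the historical mean and tolerance below are illustrative, and a real deployment would use Great Expectations or similar rather than hand-rolled checks.

```python
import statistics

HISTORICAL_MEAN = 50.0  # illustrative value from the training-time distribution

def check_statistical_contract(values, historical_mean=HISTORICAL_MEAN,
                               tolerance=0.10):
    """Return a list of violations; empty means the delivery passes."""
    violations = []
    if any(v is None for v in values):
        violations.append("null values present")
    elif any(v < 0 for v in values):
        violations.append("negative values present")
    else:
        mean = statistics.mean(values)
        if abs(mean - historical_mean) > tolerance * historical_mean:
            violations.append(
                f"mean {mean:.1f} outside tolerance of {historical_mean}")
    return violations

# A delivery that is schema-valid (right type, no nulls) but statistically
# shifted -- exactly the case that schema checks alone would accept.
print(check_statistical_contract([90.0, 95.0, 100.0]))
```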
Consumer-driven contract testing can be approximated procedurally through documented interface specifications and manual integration verification. This approach suits organisations that have not yet adopted contract testing tooling but still need to demonstrate compliance with interface validation requirements.
An interface specification document is needed for each dependency, defining the expected request format, response format, value ranges, and response time. A manual integration test checklist ensures that before each release, test requests are submitted to each dependency and the responses verified against the specification. A change notification process should log any changes to a dependency's interface specification and trigger a re-run of the integration test.
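Even when the verification step is run by hand, the specification itself can be held as structured data and the checklist step as a small script, which keeps the manual process repeatable. The endpoint, fields, and thresholds below are hypothetical.

```python
# Interface specification for one dependency, as data.
spec = {
    "endpoint": "/v1/score",
    "response_fields": {
        "score": {"type": float, "min": 0.0, "max": 1.0},
        "model_version": {"type": str},
    },
    "max_response_ms": 500,
}

def verify_response(response, elapsed_ms, specification):
    """The manual checklist step: compare one response against the spec."""
    issues = []
    for field, rules in specification["response_fields"].items():
        if field not in response:
            issues.append(f"missing '{field}'")
            continue
        value = response[field]
        if not isinstance(value, rules["type"]):
            issues.append(f"'{field}' has wrong type")
            continue
        if "min" in rules and value < rules["min"]:
            issues.append(f"'{field}' below minimum")
        if "max" in rules and value > rules["max"]:
            issues.append(f"'{field}' above maximum")
    if elapsed_ms > specification["max_response_ms"]:
        issues.append("response time exceeded specification")
    return issues

print(verify_response({"score": 0.42, "model_version": "3.0.2"},
                      elapsed_ms=120, specification=spec))
```

The returned issue list doubles as the record for the release checklist: an empty list is a pass, and any entries are logged against the dependency before release.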
The manual approach discovers breaks during testing rather than during the dependency's development, so breaking changes are not detected automatically before deployment. Pact and Great Expectations are both open-source and can be adopted at minimal cost, making the transition from procedural to automated contract testing a practical upgrade path for most organisations.
Contract tests must run as part of the CI pipeline for every service. A contract test failure blocks deployment. The contract test suite should be version-controlled alongside the services and referenced in the AISDP as part of the quality management documentation. For AISDP purposes, the Technical SME documents the contract suite in the test strategy as Module 5 evidence, with contract definitions version-controlled alongside the system's code and contract test results retained as pipeline evidence. The contracts themselves serve as executable documentation of the system's interface assumptions.
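The gating relationship can be expressed directly in the pipeline definition. The fragment below uses GitHub Actions syntax for illustration; the job names and script paths are hypothetical, and the same pattern applies in any CI system that supports job dependencies.

```yaml
# Illustrative CI fragment: the deploy job cannot run unless contract tests pass.
jobs:
  contract-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/run_contract_tests.sh   # non-zero exit fails the job
  deploy:
    needs: contract-tests   # deployment is blocked on contract test success
    runs-on: ubuntu-latest
    steps:
      - run: ./scripts/deploy.sh
```

The pipeline run itself, with its recorded pass/fail results, then serves as the retained evidence referenced in the AISDP.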