Annex IV requires detailed documentation of the system's architecture, design, and computational infrastructure. This section structures that documentation around an eight-layer reference architecture spanning data ingestion through monitoring, with per-layer controls against intent and outcome drift.
AISDP Module 3 requires a description of the system's architecture, model type, algorithmic approach, key design choices, inputs and outputs, and the human-machine interaction design. The Technical SME describes the architecture at a level of detail sufficient for a qualified technical reviewer to understand the system's structure and behaviour. If a competent external reviewer cannot reconstruct the system's design rationale and operational behaviour from the documentation, the documentation is insufficient.
Before any architectural work begins, the Business Owner grounds the development in a clear articulation of business intent, ethical commitment, and transparency principles. The statement of business intent must be precise: "to assist human recruiters in screening high-volume applications by ranking candidates against role-specific competency profiles" is adequate, whereas "to improve recruitment efficiency" is too vague to constrain design decisions. The ethical framework must translate principles into concrete design constraints: "the system must achieve a selection rate ratio of at least 0.90 across all measured protected characteristic subgroups" rather than "the system must not discriminate."
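A constraint stated this concretely can be encoded as an automated check. The sketch below is a minimal illustration of the 0.90 selection rate ratio from the example above; the function name and data shapes are hypothetical, not from any specific compliance toolkit.

```python
from collections import Counter

def selection_rate_ratio(decisions, groups):
    """Minimum ratio of subgroup selection rates (1.0 = perfectly equal).

    decisions: parallel list of booleans, True meaning "selected"
    groups: parallel list of protected-characteristic subgroup labels
    """
    selected, total = Counter(), Counter()
    for d, g in zip(decisions, groups):
        total[g] += 1
        selected[g] += int(d)
    rates = {g: selected[g] / total[g] for g in total}
    return min(rates.values()) / max(rates.values()), rates

# Illustrative data: group A selected 3/4, group B selected 1/4
ratio, rates = selection_rate_ratio(
    [True, False, True, True, False, False, True, False],
    ["A", "A", "A", "A", "B", "B", "B", "B"],
)
meets_constraint = ratio >= 0.90  # the design constraint from the text
```

Expressed this way, the ethical commitment becomes a testable gate rather than a policy statement.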
The architectural documentation should include multiple diagram types at different abstraction levels. System Context diagrams (C4 Level 1) establish the system boundary and external connections. Container diagrams (C4 Level 2) show major technical building blocks with technology choices. Component diagrams (C4 Level 3) show internal structure within complex containers. Data Flow diagrams trace the path of data from ingestion to output, essential for Article 12 traceability. Deployment diagrams show the physical or cloud infrastructure. Sequence diagrams illustrate critical interaction patterns including the human oversight workflow.
Interface specifications complete the documentation: API contracts via OpenAPI or Protocol Buffers, data contracts with schemas and value range expectations, and human interface specifications showing the information presented to operators and the workflow enforced. The Technical SME versions all diagrams alongside code and model artefacts to prevent documentation drift.
A high-risk AI system designed for EU AI Act compliance is structured as a layered architecture where each layer provides specific protections against intent drift, where the system's behaviour diverges from its stated purpose, and outcome drift, where the system's outputs change over time affecting fairness, accuracy, or safety. The eight layers are: data ingestion, feature engineering, model inference, post-processing, explainability, human oversight interface, logging and audit, and monitoring.
Each layer implements controls in two categories. Controls against intent drift prevent the system from deviating from its documented intended purpose through technical enforcement rather than policy instruction. Controls against outcome drift detect and alert when the system's behaviour is changing in ways that may affect compliance, enabling timely intervention. The layered approach ensures that failures at one layer are caught by controls at subsequent layers, creating defence in depth.
The architecture feeds into AISDP Module 3 (Architecture and Design) and Module 7 (Human Oversight). Architecture decisions made at design time also have implications for the system's eventual decommissioning; systems designed with clear infrastructure-as-code definitions, isolated credential namespaces, and modular data storage are substantially easier to decommission in a controlled and auditable manner.
The data ingestion layer receives, validates, and normalises input data from deployer systems. It is the system's first contact with the outside world and the point at which malformed, adversarial, or out-of-distribution data is intercepted. Controls against intent drift include schema validation rejecting non-conforming records with logged errors rather than silent coercion, input range enforcement checking numerical features against training data distributions, prohibited feature blocking as a hard technical control excluding features identified as proxies for protected characteristics, and data minimisation stripping personal data not required for the intended purpose.
Controls against outcome drift include distribution monitoring computing real-time summary statistics and comparing them against the training baseline, and comprehensive logging recording every data record with timestamp, source identifier, validation result, and content hash for Article 12 traceability.
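The ingestion controls above can be sketched in a few lines. Everything in this sketch is illustrative: the declarative `SCHEMA`, the `PROHIBITED` feature list, and the field names are assumptions, not part of the source.

```python
import hashlib
import json

# Hypothetical schema: field name -> (expected type, allowed range from training data)
SCHEMA = {"age": (int, (16, 80)), "years_experience": (float, (0.0, 60.0))}
PROHIBITED = {"postcode"}  # illustrative: flagged as a proxy for protected characteristics

def validate_record(record):
    """Validate one inbound record: reject with logged errors, never silently coerce.

    Returns (ok, errors, content_hash); the hash supports Article 12 traceability.
    """
    errors = []
    for field in PROHIBITED & record.keys():
        errors.append(f"prohibited feature present: {field}")
    for name, (ftype, bounds) in SCHEMA.items():
        if name not in record:
            errors.append(f"missing field: {name}")
            continue
        value = record[name]
        if not isinstance(value, ftype):
            errors.append(f"wrong type for {name}: {type(value).__name__}")
        elif not (bounds[0] <= value <= bounds[1]):
            errors.append(f"{name}={value} outside training range {bounds}")
    content_hash = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return (not errors, errors, content_hash)
```

Rejection with an explicit error list, rather than coercion, is what makes the control auditable.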
The feature engineering layer transforms raw data into feature vectors consumed by the model. This is where proxy variable risks materialise and where undocumented transformations can introduce hidden bias. A central feature registry records each feature's name, source, transformation logic, expected distribution, business justification, and proxy variable risk assessment. Feature parity enforcement ensures features computed for inference are identical to those used during training, preventing training-serving skew. The proxy variable audit computes each feature's correlation with protected characteristics; features exceeding defined thresholds require documented justification from the Technical SME and AI Governance Lead.
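The proxy variable audit reduces to a correlation scan over the feature set. The sketch below uses Pearson correlation against a single binary protected attribute; the 0.3 threshold and feature names are hypothetical illustrations, and a production audit would use measures suited to categorical data as well.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def proxy_audit(features, protected, threshold=0.3):
    """Flag features whose |correlation| with the protected attribute exceeds
    the threshold; flagged features require documented justification."""
    flagged = {}
    for name, values in features.items():
        r = pearson_r(values, protected)
        if abs(r) > threshold:
            flagged[name] = round(r, 3)
    return flagged
```

Flagged features are not automatically removed; per the text, they require documented justification from the Technical SME and AI Governance Lead.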
The model inference layer executes the trained model against the feature vector and produces a raw output. Model version pinning ensures the inference service serves a specific, immutable model version from the registry, with updates requiring a deployment event with human approval. Confidence thresholding routes predictions below a defined threshold to human review before being acted upon, preventing the system from acting on uncertain predictions. Output constraint enforcement prevents pathological model behaviour from propagating downstream by enforcing hard constraints on output ranges and classification sets.
Per-prediction feature attribution using SHAP or LIME records feature contributions for each prediction, supporting explainability requirements. Prediction distribution monitoring tracks output distributions in real time to detect shifts indicating evolving model behaviour.
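Confidence thresholding and output constraint enforcement can be sketched as a post-inference check. `PINNED_MODEL_VERSION`, `ALLOWED_LABELS`, and the 0.85 threshold below are illustrative assumptions.

```python
from dataclasses import dataclass

PINNED_MODEL_VERSION = "screening-model:2.4.1"  # hypothetical registry tag
ALLOWED_LABELS = {"approve", "review", "reject"}  # hypothetical output set
CONFIDENCE_THRESHOLD = 0.85  # illustrative; set per risk assessment

@dataclass
class InferenceResult:
    label: str
    confidence: float
    model_version: str
    route: str  # "auto" or "human_review"

def postcheck(label, confidence):
    """Enforce hard output constraints, then route low-confidence predictions
    to human review instead of acting on them."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"model emitted out-of-set label: {label!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence {confidence} outside [0, 1]")
    route = "auto" if confidence >= CONFIDENCE_THRESHOLD else "human_review"
    return InferenceResult(label, confidence, PINNED_MODEL_VERSION, route)
```

Raising on an out-of-set label, rather than clamping it, stops pathological outputs from propagating downstream silently.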
The post-processing layer applies thresholds, calibrations, business rules, and output formatting, shaping raw outputs into actionable results. Every business rule is documented in the AISDP with its rationale, effect on raw output, and interaction with fairness mitigations. Business rules can override model outputs in ways that affect fairness; the Technical SME assesses each rule for fairness impact. Calibration validation confirms that fairness adjustments achieve the intended improvement without unintended side effects. Override logging records every instance where a rule changes the model's raw output.
Threshold stability monitoring tracks the proportion of inputs crossing decision thresholds over time. Changes in crossing rates indicate score distribution shifts that may require threshold recalibration. Fairness metrics computed during development are periodically recomputed on production data passing through the post-processing layer, catching drift that affects final outputs rather than just raw predictions.
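Threshold stability monitoring amounts to comparing crossing rates between a baseline window and a recent window. The sketch below uses a hypothetical 5-percentage-point tolerance; a real deployment would set the tolerance from the risk assessment.

```python
def crossing_rate(scores, threshold):
    """Proportion of scores at or above the decision threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

def threshold_stability_alert(baseline_scores, window_scores, threshold,
                              tolerance=0.05):
    """Alert when the share of inputs crossing the decision threshold shifts
    by more than `tolerance` from the baseline rate, which may indicate a
    score distribution shift requiring threshold recalibration."""
    base = crossing_rate(baseline_scores, threshold)
    cur = crossing_rate(window_scores, threshold)
    return abs(cur - base) > tolerance, base, cur
```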
The explainability layer generates human-readable explanations of individual predictions, supporting the Article 14 human oversight requirement by providing the information that oversight operators need to evaluate outputs. Explanation fidelity validation ensures explanations reflect the model's actual behaviour; an explanation attributing a decision to Feature A when the model relied on Feature B is worse than no explanation because it misleads the human overseer. Fidelity is tested by comparing feature attributions against the model's sensitivity to feature perturbations.
Explanations must be audience-appropriate. Technical operators receive precise feature contributions and confidence indicators. Affected persons receive plain-language explanations focusing on factors most relevant to their situation. Explanation consistency monitoring detects when the dominant explanation for a particular decision type changes without a corresponding model update, indicating potential instability.
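The perturbation-based fidelity test described above can be sketched as a rank-agreement check: the features with the largest attributions should also be the ones the model is most sensitive to. This is a crude illustration, not the SHAP/LIME fidelity methodology itself, and all names are hypothetical.

```python
def perturbation_sensitivity(predict, x, feature, delta=1.0):
    """Change in model output when a single feature is nudged by delta."""
    x_pert = dict(x)
    x_pert[feature] = x_pert[feature] + delta
    return predict(x_pert) - predict(x)

def fidelity_check(predict, x, attributions, top_k=2):
    """Pass if the top-k features by |attribution| match the top-k features
    by perturbation sensitivity; a mismatch suggests the explanation does
    not reflect the model's actual behaviour."""
    sens = {f: abs(perturbation_sensitivity(predict, x, f)) for f in attributions}
    top_attr = sorted(attributions, key=lambda f: abs(attributions[f]),
                      reverse=True)[:top_k]
    top_sens = sorted(sens, key=sens.get, reverse=True)[:top_k]
    return set(top_attr) == set(top_sens)
```

Run periodically on sampled production predictions, a check like this catches the failure mode the text warns about: an explanation attributing a decision to the wrong feature.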
The human oversight interface is the component through which operators review, accept, override, or reject outputs. A mandatory review workflow enforces a review step before any output is acted upon; auto-acceptance configurations are technically prevented for high-risk systems. Automation bias countermeasures include presenting underlying data before revealing the system's recommendation, requiring minimum review duration, displaying confidence indicators prominently, and periodically presenting calibration cases with known outcomes.
Override capability is mandatory, with every override logged with operator identity, original recommendation, override decision, and stated rationale. Override rate monitoring at aggregate, per-deployer, and per-operator levels tracks system health. Review time monitoring uses average review time as a proxy for thoroughness; operators consistently reviewing cases in under 60 seconds are unlikely to be performing meaningful oversight. Interface version control ensures that interface changes are tracked alongside model changes.
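The override logging and review time controls can be sketched together. The event schema, field names, and the `MIN_REVIEW_SECONDS` value below are hypothetical illustrations of the controls described above.

```python
import time

MIN_REVIEW_SECONDS = 60  # illustrative policy value from the text's heuristic

def record_review(log, operator_id, recommendation, decision, rationale,
                  review_seconds):
    """Append one oversight event; overrides require a stated rationale, and
    suspiciously fast reviews are flagged for quality follow-up."""
    event = {
        "ts": time.time(),
        "operator": operator_id,
        "recommendation": recommendation,
        "decision": decision,
        "override": decision != recommendation,
        "rationale": rationale,
        "review_seconds": review_seconds,
        "fast_review_flag": review_seconds < MIN_REVIEW_SECONDS,
    }
    if event["override"] and not rationale:
        raise ValueError("override requires a stated rationale")
    log.append(event)
    return event
```

Rejecting an unexplained override at write time makes the rationale requirement a technical control rather than a policy instruction.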
The logging and audit layer captures a comprehensive record of system operation supporting Article 12's automatic recording requirements. Immutable logging in append-only format with cryptographic hash chains ensures tamper evidence; no system component, user, or administrator can modify historical entries. Comprehensive event coverage captures every material event: data ingestion, feature computation, model inference, post-processing, explanation generation, operator actions, configuration changes, deployment events, and monitoring alerts.
Log-based drift detection feeds aggregated data to the monitoring layer's algorithms. Changes in inference patterns, error rates, or operator behaviour provide early warning of outcome drift. A regulatory export capability supports export in formats suitable for inspection within competent authority response timelines.
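The cryptographic hash chain behind the immutable log can be sketched in a few lines. This illustrates tamper evidence only; the full control also needs append-only storage, access controls, and retention management.

```python
import hashlib
import json

class HashChainedLog:
    """Append-only event log where each entry commits to its predecessor's
    hash, so any modification of a historical entry breaks verification."""

    def __init__(self):
        self._entries = []
        self._last_hash = "0" * 64  # genesis value

    def append(self, event: dict) -> str:
        payload = json.dumps({"prev": self._last_hash, "event": event},
                             sort_keys=True)
        digest = hashlib.sha256(payload.encode()).hexdigest()
        self._entries.append(
            {"prev": self._last_hash, "event": event, "hash": digest}
        )
        self._last_hash = digest
        return digest

    def verify(self) -> bool:
        """Recompute the chain from genesis; False means tampering."""
        prev = "0" * 64
        for entry in self._entries:
            payload = json.dumps({"prev": prev, "event": entry["event"]},
                                 sort_keys=True)
            expected = hashlib.sha256(payload.encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

Because each hash covers the previous hash, an auditor can detect alteration of any historical entry without trusting the system operator.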
The monitoring layer continuously observes operational behaviour across performance, fairness, data quality, and anomalous patterns. Intent alignment dashboards display the system's current behaviour relative to documented intended purpose with clear indication of whether the system is within specification. Anomaly detection identifies unusual patterns in inputs, outputs, or operational metrics, triggering alerts and, above defined severity thresholds, automatic escalation.
Multi-dimensional drift monitoring tracks drift across input feature distributions, output score distributions, fairness metrics, error rates, and operator override rates simultaneously. Single-dimension monitoring may miss drift that manifests across multiple dimensions without crossing any individual threshold. Feedback loop detection includes specific checks for effects where the system's outputs influence data subsequently used to evaluate or retrain the system, requiring comparison of training and production distributions while controlling for the system's own influence. The monitoring outputs feed into the post-market monitoring framework.
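One common way to implement per-dimension drift scoring is the Population Stability Index (PSI); the source does not prescribe a metric, so the sketch below is an assumption, and the 0.2 alert threshold is a widely used rule of thumb rather than a regulatory value.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and a recent
    window of the same metric; larger values indicate stronger drift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def hist(xs):
        counts = [0] * bins
        for x in xs:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        # small floor avoids log(0) on empty bins
        return [max(c / len(xs), 1e-4) for c in counts]

    e, a = hist(expected), hist(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

def multi_dimensional_drift(baselines, windows, threshold=0.2):
    """Score every monitored dimension and flag those exceeding the threshold."""
    scores = {dim: psi(baselines[dim], windows[dim]) for dim in baselines}
    return {dim: round(s, 3) for dim, s in scores.items() if s > threshold}
```

Running the same scoring over scores, fairness metrics, error rates, and override rates gives the simultaneous multi-dimensional view the text calls for.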
Article 15 and Annex IV require documentation of the hardware and software environment. For cloud-hosted systems, the AISDP must specify the cloud provider, deployment region within the EU/EEA for high-risk systems processing personal data, specific services used across compute, databases, orchestration, model serving, and logging, and instance types with resource allocations including GPU/TPU specifications.
Cloud provider data processing agreements must be in place and referenced. Many organisations use managed AI services such as AWS SageMaker, Google Vertex AI, or Azure Machine Learning; the AISDP documents which services are used, what data flows through them, the provider's data handling practices, availability SLAs, and fallback strategies. For third-party model APIs, the documentation additionally covers the provider's model versioning policy, data retention practices, latency characteristics, and contractual commitments regarding behaviour stability.
Containerisation with Docker and orchestration with Kubernetes provide reproducible, versioned deployment environments. Module 3 captures the container image build process, private registry with access controls and image signing, orchestration configuration, and resource limits. Each container image is immutable, tagged with corresponding code and model versions, and stored in a private registry with access logging.
Disaster recovery and business continuity planning defines the recovery point objective and recovery time objective for the system. For high-risk AI systems, a model serving failure that forces deployers to make decisions without AI support may itself be a compliance concern. The disaster recovery plan covers model artefact backup and restoration, data pipeline failover, monitoring continuity during recovery, and communication to deployers during outages. Recovery procedures are tested at least annually with results documented in the AISDP.
Edge and on-premises deployments create compliance challenges distinct from cloud hosting. The system runs on infrastructure the provider does not control, making monitoring more difficult, updates harder to enforce, and incident response slower. The AISDP documents the edge deployment model including hardware specifications and minimum requirements, the model update mechanism covering how new model versions are delivered, validated, and activated, the monitoring approach defining what data is collected locally and what is transmitted to the provider, and the rollback procedure specifying how a faulty update is reversed.
For systems deployed across multiple EU member states, data sovereignty requirements add complexity. The AISDP documents the data residency policy for each data category, the mechanism ensuring personal data is processed within the declared region, the regulatory mapping showing which national competent authority has jurisdiction over which deployment, and the approach to language and localisation requirements. Some member states may impose additional requirements through national implementation measures; the Legal and Regulatory Advisor monitors the regulatory landscape in each deployment jurisdiction.
Multi-region architectures must ensure that monitoring data from all regions feeds into a unified post-market monitoring (PMM) framework. A system that is compliant in one region but drifting in another must be detected through cross-region analysis. Data transfer between regions for monitoring purposes must comply with GDPR transfer rules, which typically requires an adequacy decision, standard contractual clauses, or processing within the EEA.
For organisations at earlier maturity levels, infrastructure can be documented with standard diagrams and spreadsheets. A cloud resource inventory spreadsheet listing every service, its region, and its purpose provides the foundation. Architecture diagrams maintained in standard drawing tools and reviewed quarterly against the deployed environment prevent drift. The key principle is that documented infrastructure must match deployed infrastructure; any discrepancy is a non-conformity.
Feature stores such as Feast, Tecton, or Hopsworks centralise feature definitions with a single computation specification used for both training and serving, preventing the pernicious failure mode where separate training and serving pipelines compute features differently. Feature distribution monitoring runs continuously on production data, computing drift metrics per feature and alerting when thresholds are crossed.
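Feature parity enforcement can be sketched as a recomputation check: apply the single registered computation specification to raw data at serving time and compare against the values stored from training. The feature, row schema, and tolerance below are hypothetical.

```python
def feature_parity_check(compute_feature, stored_rows, tolerance=1e-9):
    """Recompute each feature with the serving-time specification and compare
    against the training-time value; any mismatch is training-serving skew."""
    mismatches = []
    for row in stored_rows:
        served = compute_feature(row["raw"])
        if abs(served - row["training_feature"]) > tolerance:
            mismatches.append(row["id"])
    return mismatches

# Illustrative single computation specification: salary in thousands
compute_salary_k = lambda raw: raw["salary"] / 1000

rows = [
    {"id": "r1", "raw": {"salary": 50000}, "training_feature": 50.0},
    {"id": "r2", "raw": {"salary": 60000}, "training_feature": 61.0},  # skewed
]
mismatches = feature_parity_check(compute_salary_k, rows)
```

A check like this, run on a sample of training records against the serving pipeline, turns "features must match exactly" into a verifiable test rather than an assumption.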