We use cookies to improve your experience and analyse site traffic.
The EU AI Act requires organisations to assess whether AI models align with EU values, fundamental rights, and regulatory expectations. Foundation models reflect the norms of their training jurisdiction, creating compliance gaps that must be evaluated and documented in the AI System Description and Performance document.
Foundation models encode the values, cultural norms, and policy perspectives of the jurisdictions where they were trained and aligned.
Foundation models encode the values, cultural norms, and policy perspectives of the jurisdictions where they were trained and aligned. Organisations deploying ai systems in the EU must assess whether each model's behaviour aligns with EU values, fundamental rights, and regulatory expectations. This assessment covers training data composition, alignment and safety tuning processes, the model provider's governance structure, and the hosting infrastructure's exposure to foreign legal jurisdictions. Model Selection and Third-Party Risk provides the broader procurement framework within which this geopolitical assessment sits.
A model that performs acceptably in its home jurisdiction may produce outputs that breach EU law or fail to account for European regulatory concepts, institutional structures, and cultural contexts. The risk is not hypothetical: differences in content moderation standards, consumer protection frameworks, and data protection regimes between the EU and other major AI-producing jurisdictions create measurable compliance gaps.
Models trained predominantly on data from a single jurisdiction may perform poorly or produce inappropriate outputs when applied to EU populations.
Models trained predominantly on data from a single jurisdiction may perform poorly or produce inappropriate outputs when applied to EU populations. A credit scoring model trained on US financial behaviour data may not generalise to European credit markets, which operate under different consumer protection frameworks. A natural language processing model trained on American English may handle EU-specific legal terminology, regulatory references, or cultural contexts inadequately.
For non-LLM systems, geographical bias testing focuses on whether the model's performance varies across EU member state populations. A credit risk model trained on or influenced by US credit market data may assign systematically different risk scores to applicants whose financial profiles reflect European consumer behaviour. The evaluation dataset should be representative of the deployment population in each target member state. The Technical SME disaggregates performance metrics by member state or region, and significant performance variations indicate geographical bias requiring additional training data, fine-tuning, or post-processing calibration.
Foundation model providers apply alignment and safety tuning processes that reflect their organisational values and home regulatory environment.
Foundation model providers apply alignment and safety tuning processes that reflect their organisational values and home regulatory environment. A model aligned primarily to US free speech norms may not meet EU expectations regarding hate speech, disinformation, or content moderation. For the aisdp, the key records are how the organisation has evaluated the model's alignment against EU legal and ethical standards and what additional fine-tuning or guardrails have been applied to address identified gaps.
The gap is structural: most widely available foundation models were trained predominantly on English-language data and aligned to the legal norms of the United States. A model aligned to US free speech norms may produce outputs that would be classified as illegal hate speech under the laws of several EU member states. A model trained primarily on US financial data may mishandle EU-specific concepts such as SEPA payment schemes, GDPR data subject rights, or the European Consumer Credit Directive. These compliance gaps must be documented and addressed in the AISDP.
Models developed by entities with close ties to foreign governments may incorporate content filtering, censorship, or bias inconsistent with EU values.
Models developed by entities with close ties to foreign governments may incorporate content filtering, censorship, or bias inconsistent with EU values. The risk assessment must evaluate the model provider's ownership structure, governance arrangements, and any known government relationships, particularly where the model will be used in sensitive domains such as law enforcement, migration, or public administration.
If the model's inference or training infrastructure is hosted outside the EU, the organisation must assess the risk that foreign governments could compel access to the model, its data, or its outputs under their domestic laws. Under the US CLOUD Act, US authorities can compel US-based cloud providers to disclose data stored anywhere in the world. China's National Intelligence Law requires Chinese organisations to cooperate with state intelligence work. Cybersecurity and Infrastructure Security covers the infrastructure dimensions of these risks in greater detail.
These extraterritorial access risks intersect with GDPR data transfer requirements. The AI System Assessor documents these risks in the AISDP's cybersecurity and data governance modules, which may influence infrastructure decisions such as choosing an EU-based cloud region or self-hosting inference infrastructure within the EU.
Behavioural benchmarking is the primary mitigation for nation-alignment risk.
Behavioural benchmarking is the primary mitigation for nation-alignment risk. A test suite is built or acquired that exercises the model against EU-specific scenarios and evaluates whether outputs meet EU legal and ethical standards. For hate speech, testing uses the thresholds defined in the Framework Decision on combating racism and xenophobia (2008/913/JHA) and relevant national implementations, not the broader US First Amendment standard. For content moderation, testing applies the Digital Services Act's obligations. For domain-specific applications, testing uses EU regulatory terminology, legal constructs, and institutional references.
HELM (Stanford's Holistic Evaluation of Language Models) provides a multi-dimensional benchmarking framework that includes cultural bias dimensions. It evaluates models across accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency; the toxicity and bias dimensions can be configured with EU-specific thresholds. MMLU (Massive Multitask Language Understanding) tests factual knowledge across domains, and EU-specific MMLU subsets can be constructed by adding questions about EU institutions, legislation, and cultural context. TruthfulQA evaluates the model's tendency to produce truthful answers, which is relevant for systems that generate factual claims.
Many AI model providers, particularly those offering cloud-based APIs or SaaS platforms, collect data from their customers' usage.
Many AI model providers, particularly those offering cloud-based APIs or SaaS platforms, collect data from their customers' usage. This data may include the inputs submitted by the organisation's users, the outputs generated by the model, usage patterns and metadata, and feedback signals such as user corrections or ratings.
Provider data collection presents several risks. The provider may use customer data to improve its models, meaning that the organisation's proprietary data, and potentially the personal data of affected individuals, is incorporated into the provider's training corpus. Data retention and processing practices may conflict with GDPR requirements. Aggregated or anonymised data may be shared with third parties, and terms of service may grant broad data usage rights that the organisation has not scrutinised.
Module 3 of the AISDP records the provider's data collection practices, the data processing agreement in place, the measures taken to prevent personal data leakage to the provider (such as pseudonymisation of inputs before API calls), and any residual risks. Where the provider is also a gpai model provider under Article 53, the downstream AI System Assessor documents provider obligations regarding data handling.
AI systems that learn or adapt over time present a distinctive risk: the system's behaviour changes as it ingests new data, is retrained on updated datasets, or undergoes algorithmic modifications.
AI systems that learn or adapt over time present a distinctive risk: the system's behaviour changes as it ingests new data, is retrained on updated datasets, or undergoes algorithmic modifications. This evolution can improve performance, but it can also introduce unpredictable shifts in the system's outputs that create new compliance concerns.
As the system processes more data, the distribution of its training corpus evolves. New data may introduce previously unrepresented populations, novel patterns, or distributional shifts that alter the model's decision boundaries. A recruitment screening system that initially performed well across demographic groups may develop disparate impact as accumulating hiring outcome data from a non-representative set of deployers skews the training distribution. The post-market monitoring system must track these accumulation effects through continuous fairness metric monitoring and periodic retraining evaluations.
Changes to the model's algorithm, hyperparameters, or ensemble composition can produce non-linear shifts in outputs that are difficult to predict from the change description alone. A seemingly minor hyperparameter adjustment, such as modifying the learning rate or changing regularisation strength, can cascade through the model's decision space, altering outcomes for specific subgroups in ways that aggregate performance metrics may not reveal. The AISDP must document the testing protocol applied after each algorithmic change, including disaggregated performance evaluation across all protected characteristic subgroups.
Systems whose outputs influence their future training data are susceptible to feedback loops. A predictive policing system that directs more patrols to certain neighbourhoods will generate more arrests there, reinforcing the system's belief that those neighbourhoods are high-crime and directing even more patrols. The must identify potential feedback loops, model their amplification dynamics, and design circuit breakers such as caps on the system's influence over its own training data, mandatory human review thresholds, or periodic recalibration against external benchmarks.
The geopolitical risk assessment produces a documented position in the AISDP that records the full chain of evaluation and mitigation decisions.
The geopolitical risk assessment produces a documented position in the AISDP that records the full chain of evaluation and mitigation decisions. This position covers the model provider's jurisdiction and governance structure, the alignment testing performed against EU standards, the results and any gaps identified, the additional fine-tuning or guardrails applied to address those gaps, the infrastructure hosting arrangements and data sovereignty analysis, and the residual risks accepted. Post-Market Monitoring addresses the ongoing re-assessment requirements.
For organisations using third-party foundation models, the AI System Assessor repeats this assessment whenever the model provider releases a new version, because alignment and safety tuning can change between versions. The organisation must also define quantitative thresholds for determining when a model evolution constitutes a substantial modification under Article 3(23). Triggers might include a change in AUC-ROC exceeding a defined tolerance, any subgroup fairness metric breaching the established threshold, a change in the model's feature importance ranking that alters the top five features, or the introduction or removal of input features. These thresholds are documented in the AISDP and integrated into the automated quality gates of the deployment pipeline.
Behavioural benchmarking involves building or acquiring test suites that exercise the model against EU-specific scenarios, testing against thresholds from EU law such as the Framework Decision on combating racism and xenophobia, rather than US standards.
Providers may use customer data to improve models, incorporating proprietary and personal data into training corpora. Data retention practices may conflict with GDPR, and terms of service may grant broad usage rights the organisation has not scrutinised.
Quantitative thresholds such as changes in AUC-ROC exceeding a defined tolerance, subgroup fairness metric breaches, changes in feature importance rankings, or the introduction or removal of input features may trigger a substantial modification assessment under Article 3(23).
HELM evaluates cultural bias with configurable EU-specific toxicity thresholds, MMLU tests factual knowledge with EU-specific subsets, and TruthfulQA evaluates truthfulness of factual claims.
Foreign governments may compel access to model data under domestic laws such as the US CLOUD Act or China's National Intelligence Law, intersecting with GDPR data transfer requirements.
The AISDP records the provider's jurisdiction, alignment testing results, gaps identified, guardrails applied, infrastructure hosting arrangements, and residual risks, repeated with each new model version.
When the same model is deployed across multiple deployer organisations, each deployer's configuration, data, and usage patterns may cause the system to behave differently. Visibility into these divergences must be maintained to assess whether any deployment has drifted outside the intended conditions of use. The post-market monitoring plan should include cross-deployment consistency checks.