When GPAI provider disclosures under Article 53 fall short of the evidence needed for Articles 9, 10, and 15 compliance, downstream deployers must build their own compensating controls across four dimensions of information asymmetry.
The compensating control architecture addresses four dimensions of information asymmetry between GPAI providers and downstream deployers: training data opacity, behavioural unpredictability, version instability, and security opacity. When a GPAI provider's disclosures under Article 53 do not give the downstream provider enough information to satisfy the requirements of Articles 9, 10, and 15, the deploying organisation must build its own evidence base through compensating controls.
Each dimension represents a category where the downstream provider cannot directly verify the GPAI model's properties. Training data opacity means the provider cannot inspect or reproduce the model's training data. Behavioural unpredictability arises because foundation models exhibit emergent behaviours that were not explicitly designed. Version instability reflects the reality that GPAI providers update models on their own schedule. Security opacity means the downstream provider cannot audit the GPAI provider's security infrastructure or model integrity controls.
The controls documented here operate at the downstream system level. They do not require cooperation from the GPAI provider beyond the minimum disclosures mandated by Article 53. Where stronger cooperation is available, through contractual mechanisms or voluntary transparency, the controls can be simplified but should not be removed entirely. GPAI Model Integration for High-Risk AI Systems covers the broader integration framework within which these controls sit.
The downstream provider cannot inspect, audit, or reproduce the GPAI model's training data. Article 10's requirements for training data documentation, completeness, representativeness, and bias assessment cannot be satisfied at the pre-training level. The compensating approach operates at two layers: behavioural proxy testing and output distribution analysis.
Behavioural proxy testing requires the Technical SME to design a sentinel evaluation dataset that exercises the fairness, accuracy, and safety dimensions the training data disclosures do not address. The dataset includes balanced representation across the protected characteristic subgroups relevant to the downstream system's intended purpose; adversarial examples testing for known bias patterns in the domain, such as gender bias in recruitment language, racial bias in credit-related terminology, and age bias in clinical descriptions; edge cases that probe the model's behaviour at the boundaries of its training distribution; and multilingual samples where the system serves non-English populations.
The sentinel dataset is versioned, maintained, and re-evaluated whenever the GPAI model version changes. The evaluation results are documented in AISDP Module 4 alongside an explicit statement that the downstream provider could not conduct Article 10 data documentation at the pre-training level and has substituted behavioural proxy testing. Training Data Governance and Documentation provides the broader Article 10 documentation framework that the proxy testing supplements.
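A minimal sketch of how such a sentinel dataset might be represented and balance-checked, assuming Python tooling; the `SentinelCase` structure, field names, and example prompts are illustrative, not part of any mandated format:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class SentinelCase:
    prompt: str           # input presented to the GPAI model
    subgroup: str         # protected-characteristic subgroup the case represents
    category: str         # e.g. "fairness", "adversarial", "edge", "multilingual"
    dataset_version: str  # pinned so results can be tied to a dataset revision

def subgroup_balance(cases):
    """Count cases per subgroup so imbalance is visible before evaluation."""
    return Counter(c.subgroup for c in cases)

cases = [
    SentinelCase("Summarise applicant A's cover letter.", "subgroup_a", "fairness", "1.2.0"),
    SentinelCase("Summarise applicant B's cover letter.", "subgroup_b", "fairness", "1.2.0"),
]
balance = subgroup_balance(cases)
```

Recording `dataset_version` on every case is what lets a re-evaluation after a model change be tied to a specific dataset revision.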
Output distribution analysis complements the sentinel dataset approach. The Technical SME analyses the GPAI model's output distribution across demographic subgroups within the downstream system's operational context. This analysis uses the downstream system's own operational data, not the GPAI provider's training data. It tests whether the GPAI model produces systematically different outputs for different subgroups when presented with equivalent inputs. Results are documented in AISDP Module 4 and cross-referenced to the fairness evaluation in Module 5.
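One way to operationalise the subgroup comparison, assuming binary outcomes extracted from the downstream system's own operational logs; the 0/1 outcome encoding and the `parity_gap` summary are illustrative simplifications of a fuller statistical analysis:

```python
def subgroup_rates(records):
    """records: iterable of (subgroup, outcome) pairs, outcome in {0, 1}."""
    totals, positives = {}, {}
    for subgroup, outcome in records:
        totals[subgroup] = totals.get(subgroup, 0) + 1
        positives[subgroup] = positives.get(subgroup, 0) + outcome
    return {g: positives[g] / totals[g] for g in totals}

def parity_gap(rates):
    """Largest absolute difference in positive-output rate between subgroups."""
    values = list(rates.values())
    return max(values) - min(values)

# Equivalent inputs, outcomes taken from operational data:
rates = subgroup_rates([("a", 1), ("a", 1), ("b", 1), ("b", 0)])
gap = parity_gap(rates)
```

A gap above the fairness tolerance defined for the system would feed into the Module 5 fairness evaluation.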
Foundation models exhibit emergent behaviours that were not explicitly designed, including reasoning capabilities, tool-use abilities, and failure modes that appear at scale and were not present in smaller models. The downstream provider cannot predict or document these behaviours from first principles, making structured characterisation essential.
The Technical SME conducts a systematic behavioural characterisation of the GPAI model within the downstream system's operational context. The characterisation covers five areas. Output determinism assesses whether the model produces the same output given the same input, and at what temperature settings. Sensitivity to prompt phrasing tests whether rephrasing the same question produces materially different outputs. Refusal behaviour identifies the conditions under which the model refuses to produce output, and whether this refusal pattern affects any demographic subgroup disproportionately.
Hallucination rate measurement determines what proportion of outputs contain factual errors and whether the error rate varies by domain or input type. Boundary behaviour assessment examines how the model behaves when inputs fall outside the domain covered by the downstream system's intended purpose. These five areas together provide a structured profile of the GPAI model's behaviour within the specific deployment context.
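Output determinism, the first of the five areas, can be measured with a simple repeated-call check; `generate` here is a stand-in for whatever wrapper the downstream system uses to call the GPAI model, and the stub in the demo only illustrates the mechanics:

```python
def determinism_ratio(generate, prompt, runs=10):
    """Fraction of repeated calls that return the modal (most common) output.

    1.0 means fully deterministic for this prompt; lower values quantify
    output variability at the current temperature settings.
    """
    outputs = [generate(prompt) for _ in range(runs)]
    modal_count = max(outputs.count(o) for o in set(outputs))
    return modal_count / runs

# Stub standing in for a real model call:
ratio = determinism_ratio(lambda p: "fixed answer", "test prompt", runs=5)
```

Running the same check across the sentinel dataset at each candidate temperature gives the determinism profile recorded in Module 3.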
The characterisation is documented in AISDP Module 3 and updated whenever the GPAI model version changes. The AI System Assessor uses the characterisation to inform the risk assessment in Module 6. A companion document details the full AISDP structure across all twelve modules.
GPAI providers update their models on their own schedule, and these updates can change the model's behaviour in ways that affect the downstream system's compliance posture. Some providers apply changes within a version identifier without notification; a model accessed under the same name today may behave differently from the model accessed under that identifier three months ago.
Where the GPAI provider's API supports version pinning, requesting a specific model checkpoint rather than the latest version, the Technical SME pins the production system to a specific version. Version upgrades are treated as system changes that trigger the governance pipeline gates and the change classification logic. This approach ensures that no behavioural change enters the production system without passing through the full assessment process.
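In code, version pinning amounts to never sending a floating alias; the request shape and the checkpoint identifier below are hypothetical, since each provider's API differs:

```python
# Hypothetical checkpoint identifier; real values come from the provider's API.
PINNED_MODEL = "example-model-2024-06-01"

def build_request(prompt, model=PINNED_MODEL):
    """Always send the pinned checkpoint, never a floating alias.

    A floating alias such as 'latest' would let behavioural changes enter
    production without passing through the governance pipeline gates.
    """
    assert "latest" not in model, "floating aliases bypass change governance"
    return {"model": model, "input": prompt}

req = build_request("classify this ticket")
```

Changing `PINNED_MODEL` then becomes an explicit, reviewable system change rather than a silent upgrade.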
Sentinel monitoring provides continuous behavioural verification. The Technical SME maintains a sentinel monitoring pipeline that evaluates the GPAI model's behaviour against the sentinel dataset on a defined schedule, daily for high-risk systems. If the sentinel evaluation detects a behavioural shift exceeding defined tolerances, whether in output distribution, fairness metrics, or hallucination rate, an automated alert triggers investigation. The alert is documented in the post-market monitoring records in AISDP Module 12 and may trigger the substantial modification assessment. Because silent changes within a version identifier are possible, sentinel monitoring remains necessary even when version pinning is in place.
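A sentinel shift check can be as simple as comparing current metrics against a stored baseline; the metric names and tolerance values below are illustrative, not recommended thresholds:

```python
# Illustrative tolerances; real values are set per system in the risk assessment.
TOLERANCES = {"fairness_gap": 0.05, "hallucination_rate": 0.02}

def detect_shift(baseline, current, tolerances=TOLERANCES):
    """Return the metrics whose change from baseline exceeds tolerance.

    A non-empty result is what triggers the automated alert and the
    investigation recorded in AISDP Module 12.
    """
    return {
        m: current[m] - baseline[m]
        for m in tolerances
        if abs(current[m] - baseline[m]) > tolerances[m]
    }

shifts = detect_shift(
    baseline={"fairness_gap": 0.01, "hallucination_rate": 0.030},
    current={"fairness_gap": 0.09, "hallucination_rate": 0.035},
)
```

Here only the fairness gap exceeds its tolerance, so only it would be flagged for investigation.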
Contractual notification requirements form the third control. The procurement contract should require the GPAI provider to notify the downstream provider before making changes that could affect the model's behaviour, with a minimum notice period of 30 days recommended for high-risk systems. A separate document addresses the contractual mechanisms in the deployer relationship.
The downstream provider cannot audit the GPAI provider's security infrastructure, training pipeline security, or model integrity controls. Article 15's resilience requirements must be satisfied through the downstream system's own security architecture, independent of the GPAI provider's internal security posture.
Prompt injection defence in depth is the primary security control for GPAI-based systems. The downstream system implements multiple layers of defence. Input sanitisation validates and constrains user inputs before they reach the GPAI model. System prompt isolation separates the system prompt from user content using the GPAI provider's recommended separation mechanisms. Output validation verifies that the GPAI model's output conforms to the expected format and content boundaries. For agentic systems, action gating requires explicit approval before the system acts on the GPAI model's output. These controls are documented in AISDP Module 9 with test evidence from the red-teaming programme.
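Two of the layers, input sanitisation and output validation, can be sketched as follows; the character filter, length bound, and label whitelist are illustrative and would be tuned to the system's intended purpose:

```python
import re

MAX_INPUT_CHARS = 4000  # illustrative bound on user input length

def sanitise_input(text):
    """Layer 1: constrain user input before it reaches the GPAI model."""
    text = text[:MAX_INPUT_CHARS]
    # Strip control characters (except tab/newline) that can smuggle
    # instructions past human review.
    return re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)

def validate_output(text, allowed_labels):
    """Layer 3: accept only outputs inside the expected response space."""
    return text.strip() in allowed_labels

clean = sanitise_input("hello\x00world")
ok = validate_output(" approve ", {"approve", "reject", "escalate"})
```

Constraining the output to a closed label set is the strongest form of output validation; free-text systems would substitute format and content checks.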
Model integrity verification addresses the risk that the GPAI model itself may have been tampered with or replaced. Where the GPAI model is accessed via API, direct verification is not possible; the sentinel monitoring pipeline provides a behavioural proxy for integrity, since unexpected changes in the model's behaviour on the sentinel dataset may indicate an integrity issue as well as a version change. Where the GPAI model is deployed on-premises or in a private cloud, the Technical SME verifies model artefact integrity using cryptographic hashes at deployment time and periodically during operation.
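For self-hosted deployments, the hash check is straightforward; this sketch assumes a SHA-256 digest of the model artefact was recorded at deployment time:

```python
import hashlib
import io

def stream_sha256(stream, chunk_size=8192):
    """Hash an open binary stream chunk by chunk (artefacts can be large)."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

def verify_artefact(path, expected_digest):
    """Compare the deployed artefact against the digest recorded at deployment."""
    with open(path, "rb") as f:
        return stream_sha256(f) == expected_digest

# Demo on an in-memory stream; real use opens the model artefact file.
digest = stream_sha256(io.BytesIO(b"model-weights"))
```

Running `verify_artefact` at deployment time and on a periodic schedule gives the direct integrity assurance that API-accessed models lack.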
Each of the four dimensions maps to specific modules within the AI System Description Package. Training data opacity controls, including sentinel dataset design and output distribution analysis, are documented in AISDP Module 4 with an explicit statement that Article 10 data documentation could not be conducted at the pre-training level. The fairness evaluation results cross-reference to Module 5.
Behavioural characterisation results are documented in AISDP Module 3 and inform the risk assessment in Module 6. Version stability controls, including version pinning decisions, sentinel monitoring schedules, and contractual notification arrangements, are documented in the post-market monitoring records in AISDP Module 12. Security controls, including prompt injection defences and model integrity verification, are documented in AISDP Module 9 with supporting evidence from the red-teaming programme.
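The dimension-to-module mapping can also be captured as a machine-readable lookup for documentation tooling; the key names are illustrative, and the module numbers follow the AISDP mapping described above:

```python
# Primary module holds the control evidence; cross_reference points to the
# module that consumes it (None where no cross-reference is described).
AISDP_MODULE_MAP = {
    "training_data_opacity":        {"primary": 4,  "cross_reference": 5},
    "behavioural_unpredictability": {"primary": 3,  "cross_reference": 6},
    "version_instability":          {"primary": 12, "cross_reference": None},
    "security_opacity":             {"primary": 9,  "cross_reference": None},
}
```

A lookup like this lets documentation tooling check that every dimension has evidence filed in its designated module.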
This distributed documentation approach ensures that each compensating control is recorded alongside the requirement it addresses, rather than in a single standalone section. Auditors reviewing any individual module will find both the primary compliance evidence and, where the GPAI provider's disclosures were insufficient, the compensating control that fills the gap. The sentinel evaluation dataset itself is versioned and maintained as a controlled artefact, with re-evaluation required whenever the GPAI model version changes to ensure the proxy evidence remains current.
The Legal and Regulatory Advisor should review the compensating control documentation to confirm that each control explicitly states which regulatory requirement it addresses and why the GPAI provider's standard disclosures were insufficient to satisfy that requirement directly. This traceability from control to requirement is what distinguishes a genuine compensating control from ad hoc testing.
A sentinel evaluation dataset is a versioned test set designed by the Technical SME that exercises fairness, accuracy, and safety dimensions that the GPAI provider's training data disclosures do not address. It includes balanced representation across protected characteristic subgroups, adversarial examples for known bias patterns, edge cases at training distribution boundaries, and multilingual samples. The dataset is re-evaluated whenever the GPAI model version changes.
Version pinning is necessary but not sufficient. Some GPAI providers apply changes within a version identifier without notification, so pinning must be combined with continuous sentinel monitoring that detects behavioural shifts and contractual notification requirements that mandate advance notice of changes.
Direct integrity verification is not possible for API-accessed models. The sentinel monitoring pipeline provides a behavioural proxy: unexpected changes in the model's behaviour on the sentinel dataset may indicate tampering or replacement. For on-premises deployments, cryptographic hash verification of model artefacts provides direct integrity assurance.
Behavioural unpredictability is addressed through systematic characterisation covering output determinism, prompt sensitivity, refusal behaviour, hallucination rates, and boundary behaviour, documented in AISDP Module 3 and updated on each version change.
Version instability is managed through version pinning to specific model checkpoints, continuous sentinel monitoring with automated alerts on behavioural shifts, and contractual notification requirements with a 30-day minimum notice period for high-risk systems.
Security opacity is addressed through prompt injection defence in depth, combining input sanitisation, system prompt isolation, output validation, and action gating, plus model integrity verification through sentinel monitoring or cryptographic hashes.
Each dimension maps to specific AISDP modules: training data to Module 4, behaviour to Module 3, version stability to Module 12, and security to Module 9, ensuring controls sit alongside the requirements they address.