The second governance gate evaluates whether the risk management system remains adequate for each proposed change. It computes a risk delta across four impact dimensions, flags changes with fundamental rights implications, and ensures residual risk stays within AISDP thresholds. This page covers the gate evaluation, FRIA trigger conditions, automated risk signals, and their limitations.
The second governance gate evaluates whether the risk management system remains adequate for the change being deployed, firing after model evaluation and before staging deployment. Every pipeline execution that modifies the model, the training data, the feature set, or the post-processing logic triggers this gate. Changes that affect only infrastructure, logging, or monitoring configuration bypass the gate unless the change alters the system's risk profile.
The gate computes a risk delta: the difference between the system's current risk profile and the risk profile that would result from the proposed change. The delta is computed across all four impact dimensions described in the risk assessment framework: health and safety, fundamental rights, operational impact, and reputational impact. The gate passes if the residual risk after the proposed change remains within the acceptability thresholds documented in the AISDP Module 6 risk register. The gate fails if any risk dimension exceeds its threshold or if a new risk has been identified that is not yet in the register.
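As a minimal sketch of the delta computation and threshold check, assuming risk is scored numerically per dimension; the dimension keys, scores, and thresholds below are illustrative, not values from a real register:

```python
# Illustrative sketch: risk delta across the four impact dimensions.
# All numeric values are assumptions for demonstration.

DIMENSIONS = ("health_safety", "fundamental_rights", "operational", "reputational")

def risk_delta(current, projected):
    """Per-dimension difference between projected and current risk scores."""
    return {d: round(projected[d] - current[d], 6) for d in DIMENSIONS}

def within_thresholds(projected, thresholds):
    """The gate passes only if every residual risk stays within its threshold."""
    return all(projected[d] <= thresholds[d] for d in DIMENSIONS)

current = {"health_safety": 0.20, "fundamental_rights": 0.30,
           "operational": 0.40, "reputational": 0.20}
projected = {"health_safety": 0.25, "fundamental_rights": 0.45,
             "operational": 0.40, "reputational": 0.20}
thresholds = {"health_safety": 0.50, "fundamental_rights": 0.40,
              "operational": 0.60, "reputational": 0.50}

delta = risk_delta(current, projected)
print(delta["fundamental_rights"])               # 0.15
print(within_thresholds(projected, thresholds))  # False: 0.45 > 0.40
```

Here the fundamental rights dimension breaches its threshold, so the gate would fail even though the other three dimensions are unchanged or within bounds.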
The gate also evaluates whether the change has implications for the Fundamental Rights Impact Assessment. A change to the system's target population, a modification to the feature set that introduces or removes a proxy variable for a protected characteristic, or a shift in the model's decision boundary that disproportionately affects a demographic subgroup all require FRIA review. The gate flags these conditions for the Legal and Regulatory Advisor. The gate produces a Risk Gate Record containing the risk delta computation, the residual risk assessment, the FRIA review flag if triggered, and the gate decision. This record is deposited in the governance artefact registry and referenced by AISDP Module 6.
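The Risk Gate Record's contents, as listed above, could be represented as a simple structure; the field names here are assumptions derived from that list, not a mandated schema:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the Risk Gate Record described above.
# Field names and the decision vocabulary are illustrative assumptions.

@dataclass
class RiskGateRecord:
    risk_delta: dict            # per-dimension delta from the computation
    residual_risk: dict         # projected risk scores after the change
    fria_review_required: bool  # flagged for the Legal and Regulatory Advisor
    decision: str               # e.g. "pass", "fail", "pending_fria_approval"
    references: list = field(default_factory=lambda: ["AISDP Module 6"])

record = RiskGateRecord(
    risk_delta={"fundamental_rights": 0.15},
    residual_risk={"fundamental_rights": 0.45},
    fria_review_required=True,
    decision="pending_fria_approval",
)
print(record.decision)  # pending_fria_approval
```

Serialising such a record to the governance artefact registry gives AISDP Module 6 a stable reference to each gate evaluation.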
The risk delta computation is the analytical core of Gate 2, requiring two inputs: the current risk profile stored in the AISDP Module 6 risk register and the projected risk profile after the proposed change. The projected profile is derived from the model evaluation results produced by the engineering pipeline and from a structured impact assessment of the change. Four categories of automated signal feed the computation.
First, fairness metric shifts measure the difference between the current model's fairness metrics and the candidate model's metrics across all measured subgroups. A shift exceeding the declared tolerance, typically a change of more than 0.02 in absolute selection rate ratio, triggers a flag. Second, subgroup performance degradation is checked separately, because aggregate performance may remain stable while a specific subgroup experiences degraded accuracy. The evaluation must disaggregate performance by protected characteristic before the risk gate can be evaluated.
Third, prediction distribution shift detects changes in the distribution of the model's output scores that may indicate a change in the system's effective decision boundary, even if aggregate metrics are unchanged. The Population Stability Index provides the quantitative measure, with thresholds defined in the risk register. Fourth, feature importance shift identifies material changes in the relative importance of features, particularly features that are proxies for protected characteristics. A feature importance shift warrants risk review even if the output metrics are stable, because it indicates the model is relying on different evidence for its decisions.
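The four signal categories above can be sketched as follows. The 0.02 fairness tolerance comes from the text; the accuracy tolerance, PSI alert threshold, importance tolerance, bin layout, and all input values are illustrative assumptions (the real thresholds live in the risk register):

```python
import math

FAIRNESS_TOLERANCE = 0.02      # absolute change in selection rate ratio (from the text)
ACCURACY_TOLERANCE = 0.01      # assumed per-subgroup accuracy tolerance
IMPORTANCE_TOLERANCE = 0.05    # assumed shift in relative feature importance

def fairness_shift(current_ratio, candidate_ratio):
    """Signal 1: fairness metric shift beyond the declared tolerance."""
    return abs(candidate_ratio - current_ratio) > FAIRNESS_TOLERANCE

def degraded_subgroups(current_acc, candidate_acc):
    """Signal 2: subgroups whose accuracy drops even if the aggregate is stable."""
    return [g for g in current_acc
            if current_acc[g] - candidate_acc[g] > ACCURACY_TOLERANCE]

def psi(expected, actual, eps=1e-6):
    """Signal 3: Population Stability Index over binned score distributions."""
    return sum((max(a, eps) - max(e, eps)) * math.log(max(a, eps) / max(e, eps))
               for e, a in zip(expected, actual))

def importance_shift(current_imp, candidate_imp):
    """Signal 4: features whose relative importance moved materially."""
    return [f for f in current_imp
            if abs(candidate_imp.get(f, 0.0) - current_imp[f]) > IMPORTANCE_TOLERANCE]

print(fairness_shift(0.91, 0.88))                                          # True
print(degraded_subgroups({"a": 0.92, "b": 0.90}, {"a": 0.93, "b": 0.86}))  # ['b']
print(round(psi([0.10, 0.20, 0.40, 0.20, 0.10],
                [0.05, 0.15, 0.40, 0.25, 0.15]), 4))                       # 0.0805
print(importance_shift({"income": 0.30, "postcode": 0.10},
                       {"income": 0.22, "postcode": 0.18}))                # both shift
```

Each function returns a flag or a list of affected subgroups and features, which feeds the risk delta computation rather than deciding the gate outcome on its own.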
Four roles share accountability for the risk gate's operation, each contributing a distinct dimension of the assessment. The Technical SME provides the technical inputs for the risk delta computation, including the model evaluation results, the disaggregated performance metrics, and the feature importance analysis. The AI System Assessor evaluates the risk assessment as a whole, applying regulatory judgement to determine whether the quantitative signals and the qualitative context together indicate a change in the system's risk profile.
The Legal and Regulatory Advisor reviews FRIA implications when the gate flags a change that may affect fundamental rights. This review determines whether the change requires a FRIA update, whether stakeholder consultation is needed, and whether the market surveillance authority should be notified. The AI Governance Lead approves risk register updates, authorising changes to the documented risk profile that the gate has identified.
The gate decision follows a structured flow. If the change does not modify the model, data, features, or post-processing, Gate 2 is bypassed. If the residual risk is within thresholds, no new risks are identified, and no FRIA review is triggered, the gate passes. If a FRIA review is triggered, the Legal and Regulatory Advisor must approve before the gate can pass. If any threshold is breached, new unregistered risks are identified, or the FRIA review is not approved, the gate fails and the finding is recorded in the non-conformity register for remediation before deployment can proceed.
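The decision flow above reduces to a short branch sequence; the function and argument names here are assumptions, but the branch order follows the text directly:

```python
# Sketch of the Gate 2 decision flow described above; names are illustrative.

def gate2_decision(modifies_risk_surface, within_thresholds,
                   new_unregistered_risk, fria_triggered, fria_approved):
    if not modifies_risk_surface:
        return "bypass"        # infrastructure-only change, gate not evaluated
    if not within_thresholds or new_unregistered_risk:
        return "fail"          # recorded in the non-conformity register
    if fria_triggered and not fria_approved:
        return "fail"          # FRIA approval is a precondition for passing
    return "pass"

print(gate2_decision(False, True, False, False, False))  # bypass
print(gate2_decision(True, True, False, True, False))    # fail
print(gate2_decision(True, True, False, True, True))     # pass
```

Note that the bypass branch is checked first, mirroring the rule that only changes to the model, data, features, or post-processing trigger the gate at all.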
Without automated risk delta computation, the AI System Assessor conducts the evaluation manually after each model evaluation. The Assessor retrieves the current risk register from the AISDP, compares the candidate model's evaluation results against the documented thresholds, and records the comparison in a structured form. The form captures each risk dimension, the current score, the candidate score, whether the threshold is breached, and the Assessor's qualitative judgement on whether the change alters the risk profile.
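One row of that structured form could be captured as a simple record; the field names follow the sentence above, while the values and helper function are illustrative:

```python
# Hypothetical helper for one row of the manual assessment form.

def assessment_row(dimension, current_score, candidate_score, threshold, judgement):
    """Capture one risk dimension's comparison, as described above."""
    return {
        "risk_dimension": dimension,
        "current_score": current_score,
        "candidate_score": candidate_score,
        "threshold_breached": candidate_score > threshold,
        "assessor_judgement": judgement,
    }

row = assessment_row("fundamental_rights", 0.30, 0.45, 0.40,
                     "Change alters the risk profile; escalate to FRIA review.")
print(row["threshold_breached"])  # True
```

Recording the form in this shape also eases a later migration to the automated gate, since the same fields appear in the automated Risk Gate Record.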
The FRIA review is triggered by the Assessor's judgement rather than an automated flag. The Assessor applies the FRIA criteria to determine whether the change has fundamental rights implications and records the determination with supporting reasoning. The FRIA review determination and the risk delta assessment together form the manual equivalent of the automated gate record.
Manual risk evaluation is feasible for systems with infrequent changes, quarterly or less. For systems undergoing continuous retraining, the volume of evaluations makes manual assessment unsustainable without disproportionate resource commitment. The automated risk delta computation described above uses standard Python data structures and comparison logic; the implementation cost is modest compared to the ongoing cost of manual evaluation at scale.
Only changes that modify the model, training data, feature set, or post-processing logic trigger the risk gate. Changes affecting only infrastructure, logging, or monitoring bypass it unless they alter the risk profile.
If a new risk is identified that is not yet in the register, the gate fails. The risk register must be updated to include the new risk with its assessment and mitigation before the pipeline can proceed. The AI Governance Lead approves risk register updates.
A shift exceeding the declared tolerance, typically 0.02 absolute change in selection rate ratio, triggers a flag for the risk delta computation. The specific tolerance is documented in the AISDP.
These automated signals do not replace human risk assessment. They provide the quantitative foundation on which the AI System Assessor exercises judgement. A risk gate that relies solely on automated thresholds will miss contextual risks that only a human reviewer can evaluate: a change in the deployer's operational context, a shift in the regulatory environment, or a new vulnerability identified through horizon scanning.