Deployers managing multiple high-risk AI systems need a portfolio register, cross-system risk analysis, and quarterly executive reporting. Operator escalation guides, break-glass testing checklists, and seven key monitoring metrics with defined thresholds provide the operational framework for ongoing compliance.
Deployers operating multiple high-risk AI systems require portfolio-level governance that extends beyond managing each system individually. A portfolio register tracks all high-risk systems with their system identity and provider, risk classification and Annex III category, FRIA status, compliance record currency, monitoring status, and next review date. This register provides the AI Governance Lead with a single view of the organisation's AI compliance posture.
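A minimal sketch of one register entry, assuming a Python implementation; the dataclass fields and helper below are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class PortfolioEntry:
    """One row of the high-risk AI portfolio register (field names are illustrative)."""
    system_name: str           # system identity
    provider: str              # provider of record
    risk_class: str            # risk classification, e.g. "high-risk"
    annex_iii_category: str    # Annex III category, e.g. "employment"
    fria_status: str           # e.g. "complete", "in progress", "not started"
    record_current: bool       # whether the compliance record is up to date
    monitoring_status: str     # e.g. "green", "amber", "red"
    next_review: date          # next scheduled review date

def overdue_reviews(register: list[PortfolioEntry], today: date) -> list[PortfolioEntry]:
    """Entries whose next review date has passed; one view the AI Governance Lead can act on."""
    return [entry for entry in register if entry.next_review < today]
```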
Cross-system risk analysis monitors for patterns across the portfolio that individual system monitoring would miss. Multiple systems from the same provider exhibiting similar issues may indicate a provider-level problem rather than system-specific deficiencies. Common data quality problems across several systems may point to a shared data source issue. Regulatory developments such as new guidance from the AI Office or enforcement actions against comparable systems may change the compliance posture of an entire category of systems simultaneously.
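One way to surface such patterns is to group open issues by provider and by shared data source, flagging any value implicated across several distinct systems. A sketch, assuming issue records with 'system', 'provider' and 'data_source' keys (an illustrative schema, not a prescribed format):

```python
def portfolio_patterns(issues: list[dict], min_systems: int = 2) -> dict[str, list[str]]:
    """Flag providers and data sources implicated by issues on multiple distinct systems."""
    flagged: dict[str, list[str]] = {}
    for key, bucket in (("provider", "providers"), ("data_source", "data_sources")):
        systems_by_value: dict[str, set[str]] = {}
        for issue in issues:
            systems_by_value.setdefault(issue[key], set()).add(issue["system"])
        flagged[bucket] = [value for value, systems in systems_by_value.items()
                           if len(systems) >= min_systems]
    return flagged
```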
Quarterly portfolio reporting to executive leadership summarises the number and classification of systems in the portfolio, the aggregate compliance status across all systems, approaching review deadlines that require resource allocation, open issues requiring attention or escalation, resource constraints affecting the organisation's ability to maintain compliance, and regulatory developments that may affect the portfolio.
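Assuming the PortfolioEntry sketch above, the quantitative part of that report can be assembled directly from the register; open issues, resource constraints, and regulatory developments would be drawn from separate tracking:

```python
from collections import Counter
from datetime import date, timedelta

def quarterly_summary(register: list[PortfolioEntry], today: date) -> dict:
    """Assemble the register-derived figures for the quarterly executive report (sketch)."""
    return {
        "systems_total": len(register),
        "systems_by_category": dict(Counter(e.annex_iii_category for e in register)),
        "records_current": sum(e.record_current for e in register),
        "reviews_due_next_quarter": [e.system_name for e in register
                                     if today <= e.next_review <= today + timedelta(days=90)],
        "monitoring_attention": [e.system_name for e in register
                                 if e.monitoring_status in ("amber", "red")],
    }
```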
The eight-module deployer compliance record maps each deployer obligation to a discrete documentation area with defined review frequencies. Module D1 covers system identification and provider reference including the Declaration of Conformity receipt, reviewed at onboarding and on provider updates. Module D2 covers intended purpose and deployment context, reviewed quarterly and on context changes. Module D3 contains the Fundamental Rights Impact Assessment under Article 27, reviewed annually and on material changes.
Module D4 documents human oversight arrangements, training records, and break-glass procedures under Articles 26(2), 14, and 4, reviewed quarterly and on operator changes. Module D5 covers monitoring, incidents, and provider communications under Articles 26(4), 26(5), and 73, with monitoring reviewed monthly and incidents tracked continuously. Module D6 addresses data protection including the DPIA, lawful basis, and data subject rights under GDPR, reviewed annually and on processing changes. Module D7 covers EU database registration for public authority deployers under Article 49(3). Module D8 documents the review schedule, Article 25 reassessment, and version history, reviewed quarterly.
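Where the compliance record is tracked in configuration or tooling, the module structure can be captured as a simple mapping. The dictionary below is an illustrative restatement of the scopes and review cadences described above, not a prescribed schema:

```python
# Illustrative mapping of the eight deployer compliance record modules to their review cadence.
DEPLOYER_RECORD_MODULES = {
    "D1": {"scope": "System identification, provider reference, Declaration of Conformity receipt",
           "review": "at onboarding and on provider updates"},
    "D2": {"scope": "Intended purpose and deployment context",
           "review": "quarterly and on context changes"},
    "D3": {"scope": "Fundamental Rights Impact Assessment (Article 27)",
           "review": "annually and on material changes"},
    "D4": {"scope": "Human oversight, training records, break-glass procedures (Articles 26(2), 14, 4)",
           "review": "quarterly and on operator changes"},
    "D5": {"scope": "Monitoring, incidents, provider communications (Articles 26(4), 26(5), 73)",
           "review": "monitoring monthly; incidents continuously"},
    "D6": {"scope": "Data protection: DPIA, lawful basis, data subject rights (GDPR)",
           "review": "annually and on processing changes"},
    "D7": {"scope": "EU database registration for public authority deployers (Article 49(3))",
           "review": None},  # cadence not specified above
    "D8": {"scope": "Review schedule, Article 25 reassessment, version history",
           "review": "quarterly"},
}
```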
Operators at Level 2 of the oversight pyramid need a clear, simple decision framework for responding to unexpected system behaviour. If anyone is at immediate risk, the operator activates the break-glass procedure immediately without waiting for approval or further analysis.
If no one is at immediate risk, the operator assesses whether the observation is a single unusual output or a pattern. A single unexpected output should be documented, the operator should apply professional judgement, and the observation should be logged for monitoring. A pattern of unusual outputs requires further assessment: if the pattern suggests unfair treatment of a particular group, the operator escalates to their manager and the AI Governance Lead immediately. If the system appears to be performing poorly or behaving differently from its documented purpose without a fairness dimension, the operator escalates to their manager with documented examples. Observations that do not fit these categories are logged and monitored for recurrence.
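The framework reduces to a short ordered sequence of checks. A sketch, taking the operator's own assessments as boolean inputs (the function and its return strings are illustrative labels, not prescribed wording):

```python
def route_observation(immediate_risk: bool, is_pattern: bool,
                      fairness_concern: bool, performance_concern: bool) -> str:
    """Apply the Level 2 operator decision framework in the order described above."""
    if immediate_risk:
        return "activate break-glass procedure immediately; no approval required"
    if not is_pattern:
        return "document the output, apply professional judgement, log for monitoring"
    if fairness_concern:
        return "escalate immediately to manager and AI Governance Lead"
    if performance_concern:
        return "escalate to manager with documented examples"
    return "log and monitor for recurrence"
```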
Seven monitoring metrics provide the quantitative foundation for deployer-level oversight, each with threshold guidance and escalation paths through the oversight pyramid.
Output distribution shift measured by Population Stability Index should trigger investigation above 0.10 and escalation above 0.25, routed from Level 1 to Level 3. Override rate should trigger investigation below 2 per cent or above 30 per cent, escalating from Level 2 to Level 3: persistently low rates suggest automation bias, with operators accepting outputs without adequate scrutiny, and investigation should establish whether they have the training, tools, and incentives to exercise genuine oversight, while persistently high rates point to potential calibration issues. Review dwell time should trigger investigation when average time per case declines more than 20 per cent from the baseline. Complaint volume should trigger escalation on sustained increases above 50 per cent, routed from Level 3 to Level 4.
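The Population Stability Index compares the binned distribution of current outputs against a baseline. A minimal sketch of one common construction, binning by baseline quantiles, with the 0.10 and 0.25 thresholds from the guidance above applied to the result:

```python
import numpy as np

def population_stability_index(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """PSI between baseline and current output distributions, binned by baseline quantiles."""
    eps = 1e-6                                # avoid log(0) and division by zero
    edges = np.quantile(baseline, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf     # cover values outside the baseline range
    base_share = np.histogram(baseline, bins=edges)[0] / len(baseline) + eps
    curr_share = np.histogram(current, bins=edges)[0] / len(current) + eps
    return float(np.sum((curr_share - base_share) * np.log(curr_share / base_share)))

def psi_action(psi: float) -> str:
    """Threshold guidance from the text: investigate above 0.10, escalate above 0.25."""
    if psi > 0.25:
        return "escalate (Level 1 to Level 3)"
    if psi > 0.10:
        return "investigate"
    return "no action"
```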
Error rate should trigger escalation when the rate increases by more than 2 percentage points, routed from Level 1 to Level 3. Subgroup divergence, measuring error rate differences across protected characteristic subgroups, should trigger escalation when any subgroup exceeds 1.5 times the aggregate rate, routing directly to Level 4. Calibration case performance, measuring operator accuracy on known-answer cases, should trigger retraining below 80 per cent and suspension below 60 per cent.
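The subgroup divergence check is the simplest of the seven to automate: compare each subgroup's error rate against 1.5 times the aggregate rate and route any hit straight to Level 4. A sketch, assuming per-subgroup error rates have already been computed (input format is illustrative):

```python
def subgroup_divergence_alerts(subgroup_error_rates: dict[str, float],
                               aggregate_error_rate: float,
                               factor: float = 1.5) -> list[str]:
    """Subgroups whose error rate exceeds `factor` times the aggregate rate.

    Any non-empty result routes directly to Level 4 under the guidance above.
    """
    return [group for group, rate in subgroup_error_rates.items()
            if rate > factor * aggregate_error_rate]
```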
Break-glass procedures should be tested annually against a defined checklist: system halts within 30 seconds, pending cases are queued for manual decision rather than lost, provider and AI Governance Lead are notified within 5 minutes, the system remains halted until criteria are met and restart is authorised, and all records are documented in Module D4.
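A test result record keyed to the checklist items makes the pass criteria explicit and gives Module D4 a consistent artefact to file. The dataclass below is an illustrative sketch, not a prescribed format:

```python
from dataclasses import dataclass

@dataclass
class BreakGlassTestResult:
    """Outcome of one annual break-glass test, filed in Module D4 (illustrative fields)."""
    halt_seconds: float        # time from activation to system halt (target: 30 seconds)
    cases_preserved: bool      # pending cases queued for manual decision rather than lost
    notify_minutes: float      # time to notify provider and AI Governance Lead (target: 5 minutes)
    restart_gated: bool        # system stayed halted until criteria were met and restart authorised
    records_filed: bool        # test records documented in Module D4

    def passed(self) -> bool:
        return (self.halt_seconds <= 30 and self.cases_preserved
                and self.notify_minutes <= 5 and self.restart_gated and self.records_filed)
```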