Article 72(3) requires a documented post-market monitoring plan covering data collection, analysis methodology, threshold framework, escalation procedures, and feedback loops. Effective PMM requires continuous monitoring of performance metrics disaggregated by protected characteristics, fairness indicators, data drift signals, and operational health. The plan connects monitoring infrastructure to risk management, model development, and the conformity assessment record.
Article 72(3) requires a documented post-market monitoring plan as part of the technical documentation under Annex IV. The plan is not optional supplementary material but a mandatory component of the AISDP that competent authorities will review during inspections and that the internal conformity assessment must verify. The plan must define five components: the data collection strategy, the analysis methodology, the threshold and trigger framework, the escalation procedures, and the feedback loop connecting monitoring findings to the risk management system and the development cycle.
The data collection strategy specifies what data is collected from the system in production, from which sources it is gathered, and at what frequency collection occurs. The strategy must cover all five monitoring dimensions: performance metrics, fairness metrics, data drift indicators, operational health metrics, and human oversight metrics. For each data type, the plan documents the collection mechanism, whether automated telemetry, deployer reporting, or periodic sampling, and the frequency at which data is collected and analysed.
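The five-dimension collection strategy above can be expressed as structured configuration so that coverage is machine-checkable. The sketch below is illustrative only: the dimension names follow the text, but the source names, mechanisms, and cadences are hypothetical placeholders, not values prescribed by the Regulation.

```python
from dataclasses import dataclass
from enum import Enum

class Mechanism(Enum):
    AUTOMATED_TELEMETRY = "automated_telemetry"
    DEPLOYER_REPORTING = "deployer_reporting"
    PERIODIC_SAMPLING = "periodic_sampling"

@dataclass(frozen=True)
class CollectionSpec:
    dimension: str        # one of the five monitoring dimensions
    source: str           # where the data is gathered from
    mechanism: Mechanism  # how it is collected
    frequency_hours: int  # collection and analysis cadence

# Hypothetical plan covering all five dimensions named in the text.
PLAN = [
    CollectionSpec("performance", "inference_logs", Mechanism.AUTOMATED_TELEMETRY, 24),
    CollectionSpec("fairness", "inference_logs", Mechanism.PERIODIC_SAMPLING, 168),
    CollectionSpec("data_drift", "feature_store", Mechanism.AUTOMATED_TELEMETRY, 24),
    CollectionSpec("operational_health", "service_metrics", Mechanism.AUTOMATED_TELEMETRY, 1),
    CollectionSpec("human_oversight", "deployer_portal", Mechanism.DEPLOYER_REPORTING, 720),
]

def covered_dimensions(plan):
    """Return the set of monitoring dimensions the plan covers."""
    return {spec.dimension for spec in plan}
```

Encoding the plan this way lets a documentation check assert that no monitoring dimension is silently dropped when the plan is revised.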
The analysis methodology specifies how the collected data is analysed, what metrics are computed from the raw data, and what statistical tests are applied to detect meaningful changes. The methodology must produce reproducible, auditable results: running the same analysis on the same data must produce the same conclusion.
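One common choice of statistical test for detecting a meaningful change in a success-type metric (such as accuracy) between a baseline window and a monitoring window is a two-proportion z-test. The sketch below is a minimal, deterministic example of such a test, not the methodology the Regulation mandates; the 1.96 critical value (a two-sided ~5% level) is an illustrative default.

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Two-proportion z-test statistic for a change in a success rate
    (e.g. accuracy) between a baseline and a monitoring window.
    Deterministic: the same inputs always yield the same statistic,
    so the analysis is reproducible and auditable."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def significant_change(z, critical=1.96):
    """Flag a change at the ~5% two-sided level (illustrative default)."""
    return abs(z) >= critical
```

For example, a drop from 900/1000 correct to 820/1000 correct is flagged, while a drop to 895/1000 is treated as normal variation.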
The threshold and trigger framework defines what constitutes normal variation versus an alert condition for each monitored metric. Thresholds must balance sensitivity against operational burden: thresholds set too tightly generate excessive false alerts causing alert fatigue, while thresholds set too loosely allow compliance-relevant changes to pass undetected. The alerting and escalation framework describes the three-tier severity structure that operationalises these thresholds.
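The three-tier severity structure can be operationalised as a simple classification of a metric's relative degradation from its baseline. The tier names and percentage cut-offs below are illustrative placeholders chosen for the sketch, not values fixed by the Regulation; each provider must calibrate them to balance sensitivity against alert fatigue.

```python
def classify(metric_value, baseline, warn_pct=0.05, alert_pct=0.10, critical_pct=0.20):
    """Map a metric's relative drop from baseline into a three-tier
    severity structure. Thresholds are illustrative: tight enough to
    catch compliance-relevant changes, loose enough to tolerate
    normal variation."""
    drop = (baseline - metric_value) / baseline
    if drop >= critical_pct:
        return "critical"
    if drop >= alert_pct:
        return "alert"
    if drop >= warn_pct:
        return "warning"
    return "normal"
```

With a baseline accuracy of 0.90, a reading of 0.88 stays within normal variation, 0.84 raises a warning, 0.80 an alert, and 0.70 a critical alert.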
The escalation procedures define who is notified when a threshold is breached, how quickly the notification occurs, and what actions the recipient is expected to take. The procedures must account for different severity levels, out-of-hours scenarios, and the availability of named alternates for every role in the escalation chain.
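The routing logic implied by these requirements, severity-dependent recipients, named alternates, and an out-of-hours path, can be sketched as follows. The role names, business-hours window, and on-call fallback are hypothetical examples, not roles the Regulation prescribes.

```python
from datetime import datetime, timezone

# Hypothetical escalation chain: every role has a named alternate.
CHAIN = {
    "warning":  {"primary": "ml-ops-engineer",    "alternate": "ml-ops-lead"},
    "alert":    {"primary": "model-owner",        "alternate": "deputy-model-owner"},
    "critical": {"primary": "compliance-officer", "alternate": "quality-manager"},
}

BUSINESS_HOURS = range(9, 18)  # 09:00-17:59 UTC, illustrative

def notify_target(severity, now=None, primary_available=True):
    """Pick the recipient for a breached threshold: fall back to the
    named alternate when the primary is unavailable; outside business
    hours, critical alerts page the on-call rotation instead."""
    now = now or datetime.now(timezone.utc)
    if severity == "critical" and now.hour not in BUSINESS_HOURS:
        return "on-call-rotation"
    role = CHAIN[severity]
    return role["primary"] if primary_available else role["alternate"]
```

The key design point is that no severity level has a single point of failure: every path resolves to a reachable recipient.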
The feedback loop defines how PMM findings are integrated into three areas: the risk management system, where monitoring data updates the risk register and may trigger risk re-assessment; the AISDP, where monitoring evidence is incorporated into the relevant modules; and the system's development cycle, where monitoring findings inform model retraining, feature engineering changes, or threshold recalibration. The feedback loop closes the gap between monitoring and action, ensuring that findings translate into system improvements rather than accumulating in dashboards.
Aggregate metrics alone are not sufficient: they can mask subgroup-specific degradation. All metrics must be disaggregated across protected characteristics where the underlying data is lawfully available.
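The masking effect is easy to demonstrate: a healthy-looking aggregate can hide a badly degraded subgroup. The sketch below computes accuracy per subgroup; the record field names are hypothetical.

```python
from collections import defaultdict

def disaggregated_accuracy(records, group_key="protected_characteristic"):
    """Compute accuracy per subgroup rather than in aggregate, so
    subgroup-specific degradation is not hidden by a healthy overall
    figure. Each record is a dict with a boolean `correct` field and
    a subgroup field (field names are illustrative)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for r in records:
        bucket = totals[r[group_key]]
        bucket[0] += int(r["correct"])
        bucket[1] += 1
    return {group: correct / n for group, (correct, n) in totals.items()}
```

With 9/10 correct in one subgroup and 5/10 in another, the aggregate accuracy of 0.70 looks unremarkable while the second subgroup sits at 0.50, a gap only the disaggregated view reveals.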
Where direct measurement is not feasible, use compensating strategies including proxy-based estimation, periodic deployer surveys, external benchmark comparison, and structured feedback analysis examining complaint patterns.
The plan must define diagnostic procedures for common alert patterns, distinguishing infrastructure issues from model degradation, as the response path and regulatory implications differ.
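A first-pass triage rule for this distinction can be sketched as follows. The field names and numeric thresholds are illustrative assumptions for the sketch; the point is the ordering: rule out the serving stack before concluding model degradation, because only the latter engages the regulatory response path.

```python
def triage(alert):
    """First-pass diagnosis of a breached threshold: elevated error
    rates or tail latency point to an infrastructure fault; degraded
    model metrics on a healthy stack suggest model degradation and
    engage the regulatory response path. Thresholds are illustrative."""
    infra_unhealthy = alert["error_rate"] > 0.01 or alert["p99_latency_ms"] > 2000
    model_degraded = alert["metric_drop"] > 0.05
    if infra_unhealthy:
        return "infrastructure"
    if model_degraded:
        return "model_degradation"
    return "no_fault_found"
```

An automated rule like this only routes the alert; the plan's diagnostic procedures still govern the human investigation that follows.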
The core operational health metrics are availability, inference latency (including tail percentiles), error rates classified by type, resource utilisation, and dependency health.
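Tail percentiles matter because mean latency hides the slow requests users actually experience. A minimal sketch of a nearest-rank percentile, one common convention among several, suitable for p95/p99 latency:

```python
import math

def percentile(samples, q):
    """Nearest-rank percentile of a list of latency samples (ms).
    One common convention; interpolating variants also exist."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]
```

On 100 samples of 1..100 ms, the p99 is 99 ms even though the mean is only 50.5 ms, which is exactly the gap tail monitoring exists to surface.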