Article 72(3) requires a documented post-market monitoring plan covering data collection, analysis methodology, threshold framework, escalation procedures, and feedback loops. Effective PMM requires continuous monitoring of performance metrics disaggregated by protected characteristics, fairness indicators, data drift signals, and operational health. The plan connects monitoring infrastructure to risk management, model development, and the conformity assessment record.
Article 72(3) requires a documented post-market monitoring plan as part of the technical documentation under Annex IV. The plan is not optional supplementary material but a mandatory component of the AISDP that competent authorities will review during inspections and that the internal conformity assessment must verify. The plan must define five components: the data collection strategy, the analysis methodology, the threshold and trigger framework, the escalation procedures, and the feedback loop connecting monitoring findings to the risk management system and the development cycle.
The data collection strategy specifies what data is collected from the system in production, from which sources it is gathered, and at what frequency collection occurs. The strategy must cover all five monitoring dimensions: performance metrics, fairness metrics, data drift indicators, operational health metrics, and human oversight metrics. For each data type, the plan documents the collection mechanism, whether automated telemetry, deployer reporting, or periodic sampling, and the frequency at which data is collected and analysed.
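The five-dimension collection strategy above can be expressed as structured configuration so that coverage is machine-checkable. The sketch below is illustrative only: the dimension names follow the text, but the source names, mechanisms, and cadences are hypothetical placeholders, not values prescribed by the Regulation.

```python
from dataclasses import dataclass
from enum import Enum

class Mechanism(Enum):
    AUTOMATED_TELEMETRY = "automated_telemetry"
    DEPLOYER_REPORTING = "deployer_reporting"
    PERIODIC_SAMPLING = "periodic_sampling"

@dataclass(frozen=True)
class CollectionSpec:
    dimension: str        # one of the five monitoring dimensions
    source: str           # where the data is gathered from
    mechanism: Mechanism  # how it is collected
    frequency_hours: int  # collection and analysis cadence

# Hypothetical plan covering all five dimensions named in the text.
PLAN = [
    CollectionSpec("performance", "inference_logs", Mechanism.AUTOMATED_TELEMETRY, 24),
    CollectionSpec("fairness", "inference_logs", Mechanism.PERIODIC_SAMPLING, 168),
    CollectionSpec("data_drift", "feature_store", Mechanism.AUTOMATED_TELEMETRY, 24),
    CollectionSpec("operational_health", "service_metrics", Mechanism.AUTOMATED_TELEMETRY, 1),
    CollectionSpec("human_oversight", "deployer_portal", Mechanism.DEPLOYER_REPORTING, 720),
]

def covered_dimensions(plan):
    """Return the set of monitoring dimensions the plan covers."""
    return {spec.dimension for spec in plan}
```

Encoding the plan this way lets a documentation check assert that no monitoring dimension is silently dropped when the plan is revised.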
The analysis methodology specifies how the collected data is analysed, what metrics are computed from the raw data, and what statistical tests are applied to detect meaningful changes. The methodology must produce reproducible, auditable results: running the same analysis on the same data must produce the same conclusion.
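One common choice of statistical test for detecting a meaningful change in a success-type metric (such as accuracy) between a baseline window and a monitoring window is a two-proportion z-test. The sketch below is a minimal, deterministic example of such a test, not the methodology the Regulation mandates; the 1.96 critical value (a two-sided ~5% level) is an illustrative default.

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Two-proportion z-test statistic for a change in a success rate
    (e.g. accuracy) between a baseline and a monitoring window.
    Deterministic: the same inputs always yield the same statistic,
    so the analysis is reproducible and auditable."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

def significant_change(z, critical=1.96):
    """Flag a change at the ~5% two-sided level (illustrative default)."""
    return abs(z) >= critical
```

For example, a drop from 900/1000 correct to 820/1000 correct is flagged, while a drop to 895/1000 is treated as normal variation.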
The threshold and trigger framework defines what constitutes normal variation versus an alert condition for each monitored metric. Thresholds must balance sensitivity against operational burden: thresholds set too tightly generate excessive false alerts causing alert fatigue, while thresholds set too loosely allow compliance-relevant changes to pass undetected. The alerting and escalation framework describes the three-tier severity structure that operationalises these thresholds.
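The three-tier severity structure can be operationalised as a simple classification of a metric's relative degradation from its baseline. The tier names and percentage cut-offs below are illustrative placeholders chosen for the sketch, not values fixed by the Regulation; each provider must calibrate them to balance sensitivity against alert fatigue.

```python
def classify(metric_value, baseline, warn_pct=0.05, alert_pct=0.10, critical_pct=0.20):
    """Map a metric's relative drop from baseline into a three-tier
    severity structure. Thresholds are illustrative: tight enough to
    catch compliance-relevant changes, loose enough to tolerate
    normal variation."""
    drop = (baseline - metric_value) / baseline
    if drop >= critical_pct:
        return "critical"
    if drop >= alert_pct:
        return "alert"
    if drop >= warn_pct:
        return "warning"
    return "normal"
```

With a baseline accuracy of 0.90, a reading of 0.88 stays within normal variation, 0.84 raises a warning, 0.80 an alert, and 0.70 a critical alert.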
The escalation procedures define who is notified when a threshold is breached, how quickly the notification occurs, and what actions the recipient is expected to take. The procedures must account for different severity levels, out-of-hours scenarios, and the availability of named alternates for every role in the escalation chain.
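The routing logic implied by these requirements, severity-dependent recipients, named alternates, and an out-of-hours path, can be sketched as follows. The role names, business-hours window, and on-call fallback are hypothetical examples, not roles the Regulation prescribes.

```python
from datetime import datetime, timezone

# Hypothetical escalation chain: every role has a named alternate.
CHAIN = {
    "warning":  {"primary": "ml-ops-engineer",    "alternate": "ml-ops-lead"},
    "alert":    {"primary": "model-owner",        "alternate": "deputy-model-owner"},
    "critical": {"primary": "compliance-officer", "alternate": "quality-manager"},
}

BUSINESS_HOURS = range(9, 18)  # 09:00-17:59 UTC, illustrative

def notify_target(severity, now=None, primary_available=True):
    """Pick the recipient for a breached threshold: fall back to the
    named alternate when the primary is unavailable; outside business
    hours, critical alerts page the on-call rotation instead."""
    now = now or datetime.now(timezone.utc)
    if severity == "critical" and now.hour not in BUSINESS_HOURS:
        return "on-call-rotation"
    role = CHAIN[severity]
    return role["primary"] if primary_available else role["alternate"]
```

The key design point is that no severity level has a single point of failure: every path resolves to a reachable recipient.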
The feedback loop defines how PMM findings are integrated into three areas: the risk management system, where monitoring data updates the risk register and may trigger risk re-assessment; the AISDP, where monitoring evidence is incorporated into the relevant modules; and the system's development cycle, where monitoring findings inform model retraining, feature engineering changes, or threshold recalibration. The feedback loop closes the gap between monitoring and action, ensuring that findings translate into system improvements rather than accumulating in dashboards.
Aggregate metrics alone are not sufficient: they can mask subgroup-specific degradation. All metrics must be disaggregated across protected characteristics where the underlying data is lawfully available.
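The masking effect is easy to demonstrate: a healthy-looking aggregate can hide a badly degraded subgroup. The sketch below computes accuracy per subgroup; the record field names are hypothetical.

```python
from collections import defaultdict

def disaggregated_accuracy(records, group_key="protected_characteristic"):
    """Compute accuracy per subgroup rather than in aggregate, so
    subgroup-specific degradation is not hidden by a healthy overall
    figure. Each record is a dict with a boolean `correct` field and
    a subgroup field (field names are illustrative)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [correct, total]
    for r in records:
        bucket = totals[r[group_key]]
        bucket[0] += int(r["correct"])
        bucket[1] += 1
    return {group: correct / n for group, (correct, n) in totals.items()}
```

With 9/10 correct in one subgroup and 5/10 in another, the aggregate accuracy of 0.70 looks unremarkable while the second subgroup sits at 0.50, a gap only the disaggregated view reveals.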
Where direct measurement is not feasible, use compensating strategies including proxy-based estimation, periodic deployer surveys, external benchmark comparison, and structured feedback analysis examining complaint patterns.
The plan must define diagnostic procedures for common alert patterns, distinguishing infrastructure issues from model degradation, as the response path and regulatory implications differ.
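A first-pass triage rule for this distinction can be sketched as follows. The field names and numeric thresholds are illustrative assumptions for the sketch; the point is the ordering: rule out the serving stack before concluding model degradation, because only the latter engages the regulatory response path.

```python
def triage(alert):
    """First-pass diagnosis of a breached threshold: elevated error
    rates or tail latency point to an infrastructure fault; degraded
    model metrics on a healthy stack suggest model degradation and
    engage the regulatory response path. Thresholds are illustrative."""
    infra_unhealthy = alert["error_rate"] > 0.01 or alert["p99_latency_ms"] > 2000
    model_degraded = alert["metric_drop"] > 0.05
    if infra_unhealthy:
        return "infrastructure"
    if model_degraded:
        return "model_degradation"
    return "no_fault_found"
```

An automated rule like this only routes the alert; the plan's diagnostic procedures still govern the human investigation that follows.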
The core operational health metrics are availability, inference latency (including tail percentiles), error rates classified by type, resource utilisation, and dependency health.
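Tail percentiles matter because mean latency hides the slow requests users actually experience. A minimal sketch of a nearest-rank percentile, one common convention among several, suitable for p95/p99 latency:

```python
import math

def percentile(samples, q):
    """Nearest-rank percentile of a list of latency samples (ms).
    One common convention; interpolating variants also exist."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]
```

On 100 samples of 1..100 ms, the p99 is 99 ms even though the mean is only 50.5 ms, which is exactly the gap tail monitoring exists to surface.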