Post-market monitoring is an ongoing operational cost requiring 15 to 25 percent of the system's annual development budget. PMM data is subject to GDPR data minimisation and storage limitation, balanced against the AI Act's ten-year retention obligation through a tiered retention approach. The feedback loop translates monitoring findings into system improvements, with decision authority tiered by impact and a separate backlog preventing deprioritisation.
Many high-risk AI systems are deployed by third-party deployers who control the production environment. The provider may have limited or no direct access to inference logs, operator behaviour data, or real-world outcomes. This creates a structural monitoring gap: the provider bears the PMM obligation under Article 72, but the deployer holds the data the provider needs to fulfil it.
Limited-visibility deployments occur when the provider ships the model as an API, a container, or an edge artefact, and the deployer integrates it into their own infrastructure. The provider cannot instrument the deployer's environment directly, cannot access the deployer's inference logs, and may not receive ground truth labels. Yet the provider retains the full PMM obligation under Article 72. Three categories of mechanism bridge this gap: contractual mechanisms that establish the obligation, technical mechanisms that automate data collection, and compensating strategies that address residual gaps where the first two mechanisms are insufficient.
The deployment contract should specify the minimum data that the deployer must provide, the format and frequency of delivery, the quality standards the data must meet, and the consequences of non-provision. A deployer who signs a contract agreeing to share monthly aggregated performance statistics and then fails to deliver creates a compliance risk for the provider. The contract should include escalation provisions for data delivery failures and, in extreme cases, the provider's right to suspend the system until the data supply is restored.
The minimum contractual data package should include inference volumes broken down by request type where applicable, error rates and error type distributions, human oversight metrics covering override rates, escalation rates, and average review times, a summary of deployer-observed anomalies and complaints, and confirmation that the system is being used within its documented intended purpose. For systems where disaggregated fairness monitoring is required, the contract should address whether the deployer will provide demographic data or proxy indicators, subject to the data governance controls. The Legal and Regulatory Advisor formalises these data sharing requirements in the deployment contract as binding obligations.
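The minimum data package lends itself to a machine-checkable schema, so that incoming deployer reports can be validated automatically rather than reviewed by hand. The following sketch is illustrative: the field names and validation rules are assumptions about how a provider might encode the package described above, not a prescribed format.

```python
# Hypothetical encoding of the minimum contractual data package;
# field names and validation rules are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class MonthlyDeployerReport:
    """One reporting period's minimum data package from a deployer."""
    period: str                                # e.g. "2025-06"
    inference_volume_by_type: dict            # request-type breakdown
    error_rate: float                          # overall error rate, 0..1
    error_type_counts: dict
    override_rate: float                       # human overrides / decisions
    escalation_rate: float
    avg_review_seconds: float
    anomaly_summary: str
    within_intended_purpose: bool              # deployer's confirmation

    def validate(self) -> list:
        """Return a list of quality problems; empty means acceptable."""
        problems = []
        for name in ("error_rate", "override_rate", "escalation_rate"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                problems.append(f"{name} out of range: {value}")
        if sum(self.inference_volume_by_type.values()) == 0:
            problems.append("zero inference volume reported")
        if not self.within_intended_purpose:
            problems.append("use outside documented intended purpose")
        return problems
```

A report that fails validation would feed the contract's escalation provisions for data delivery failures rather than silently entering the monitoring record.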
Where the contractual framework establishes the data sharing obligation, the technical implementation determines whether the obligation is met in practice. Relying on the deployer to produce manual reports on a monthly schedule is fragile and slow: reports may be delayed, incomplete, or of variable quality, and the provider has no visibility between submissions. Where feasible, the provider should implement automated data pipelines that collect monitoring data from the deployer's environment with minimal manual intervention. The choice of pipeline mechanism depends on the deployer's technical sophistication, their security and data governance constraints, and the monitoring latency the provider's PMM plan requires.
Three technical mechanisms bridge the visibility gap, and most deployments require all three. Telemetry agents are lightweight monitoring components that the provider packages alongside the model. An OpenTelemetry Collector sidecar or Fluent Bit forwarder runs in the deployer's environment, collects inference metadata including input distributions, output distributions, latency, and error rates in a structured format, and transmits it to the provider's monitoring infrastructure. The Technical SME designs the telemetry to minimise the data transmitted: distributional summaries and aggregate metrics rather than raw inference data, to respect the deployer's data sovereignty and minimise bandwidth. The telemetry schema, transmission frequency, and data handling terms are documented by the Legal and Regulatory Advisor in the deployer agreement.
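The data-minimising aggregation step can be sketched in a few lines: raw inference records are collapsed into distributional summaries inside the deployer's environment, and only the summary leaves it. The record fields below are illustrative assumptions, not a defined telemetry schema.

```python
# Sketch of point-of-collection aggregation for a telemetry agent:
# distributional summaries are transmitted, never raw inference data.
# Record field names ("latency_ms", "error", "output_class") are assumptions.
from collections import Counter
from statistics import mean


def summarise_window(records: list) -> dict:
    """Collapse raw inference records into one aggregate telemetry payload."""
    latencies = sorted(r["latency_ms"] for r in records)
    p95 = latencies[int(0.95 * (len(latencies) - 1))]
    return {
        "inference_count": len(records),
        "error_rate": sum(r["error"] for r in records) / len(records),
        "latency_ms_mean": mean(latencies),
        "latency_ms_p95": p95,
        # Output *distribution*, not individual outputs.
        "output_class_counts": dict(Counter(r["output_class"] for r in records)),
    }
```

An agent running this hourly would transmit a few hundred bytes per window regardless of inference volume, which is what makes the mechanism acceptable under most deployers' data sovereignty constraints.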
Callback APIs provide a structured channel for the deployer to report events to the provider. Published webhook endpoints cover specific event types: performance degradation reports, incident notifications, user complaints, and ground truth feedback. Deployers call these endpoints when events occur. The API schema should be pre-defined, documented in the Instructions for Use, and include validation to ensure data quality. This mechanism depends on deployer cooperation, and the deployer agreement should include an obligation to use the callback APIs and a defined SLA for reporting.
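Server-side validation of callback payloads is what makes the pre-defined schema enforceable. The sketch below is an assumption about how the event types named above might be validated; the required fields per event type are illustrative, not a published schema.

```python
# Illustrative server-side validation for a deployer callback endpoint.
# Event types mirror the text; the required fields are assumptions.
REQUIRED_FIELDS = {
    "performance_degradation": {"metric", "observed_value", "detected_at"},
    "incident": {"severity", "description", "occurred_at"},
    "user_complaint": {"summary", "received_at"},
    "ground_truth_feedback": {"prediction_id", "actual_outcome"},
}


def validate_event(payload: dict) -> tuple:
    """Accept or reject a deployer-submitted event, with a reason."""
    event_type = payload.get("event_type")
    if event_type not in REQUIRED_FIELDS:
        return False, f"unknown event_type: {event_type!r}"
    missing = REQUIRED_FIELDS[event_type] - payload.keys()
    if missing:
        return False, f"missing fields: {sorted(missing)}"
    return True, "accepted"
```

Rejected submissions should be logged and surfaced back to the deployer, since a pattern of malformed callbacks is itself a data-quality signal under the reporting SLA.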
Even with contractual and technical mechanisms in place, the provider may not receive all the data it needs. Deployer data may be delayed, incomplete, or of variable quality. The PMM plan must define compensating strategies for operating under partial visibility.
Periodic audits, in which the provider's PMM team visits the deployer site or conducts a remote review to verify monitoring data quality and completeness, provide a direct check on the data pipeline's reliability. Annual monitoring audits, documented and retained as evidence, strengthen the provider's compliance posture even when day-to-day data flows are imperfect.
Synthetic monitoring provides an independent performance check that does not depend on deployer-provided data. The test cases should span the system's intended use cases and include edge cases relevant to the risk register. Deployer satisfaction surveys conducted quarterly provide qualitative feedback on the system's real-world performance from the deployer's perspective. Survey results are a leading indicator: a decline in deployer satisfaction often precedes the formal incident reports that the structured feedback channel captures.
For organisations using procedural approaches, weekly sentinel testing can be conducted manually with results recorded in a spreadsheet, and deployers can submit structured monthly performance reports via email or shared document templates. Real-time visibility into deployer environments is lost with this approach, and deployer reporting depends on deployer compliance with the reporting obligation. OpenTelemetry Collector is open-source and provides automated telemetry at zero licence cost.
PMM is an ongoing operational cost that persists for the system's entire lifetime. Organisations that budget for development and deployment but not for sustained monitoring find themselves cutting corners on compliance obligations under Article 72.
Personnel requirements include dedicated analytical capacity. The PMM analyst or team reviews monitoring dashboards, investigates alerts, prepares PMM reports, and coordinates with the engineering team on remediation. For a medium-complexity high-risk system, a reasonable estimate is 0.25 to 0.5 FTE of dedicated PMM analytical effort, supplemented by engineering support during alert investigation and remediation.
Infrastructure costs include the monitoring stack covering data collection, storage, computation, alerting, and dashboards, with storage costs growing over time as monitoring data accumulates. Organisations should project these costs over the system's expected lifetime and factor in the ten-year retention obligation for compliance-relevant data.
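Because nothing compliance-relevant is deleted for ten years, storage cost grows triangularly rather than linearly: each month stores everything accumulated so far. A back-of-envelope projection, with monthly data volume and unit cost as assumed inputs:

```python
# Back-of-envelope storage projection under a ten-year retention
# obligation. Data volume and unit cost are assumed planning inputs.
def project_storage_cost(gb_per_month: float, cost_per_gb_month: float,
                         years: int = 10) -> float:
    """Cumulative storage cost when nothing is deleted for `years` years."""
    months = years * 12
    # Month m stores m months of accumulated data, so the total number of
    # GB-months billed over the period is triangular: months*(months+1)/2.
    accumulated_gb_months = months * (months + 1) / 2 * gb_per_month
    return accumulated_gb_months * cost_per_gb_month
```

For example, 50 GB of monitoring data per month at €0.02 per GB-month comes to €7,260 over ten years, an order of magnitude more than a naive "current monthly bill times 120" estimate would suggest.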
Testing costs include periodic re-validation testing covering performance, fairness, and robustness at defined intervals, not only in response to alerts. Quarterly or biannual re-validation exercises require engineering effort and compute resources. For systems with limited production visibility, the testing budget must also account for the sentinel test suites and the periodic deployer audits that compensate for gaps in automated monitoring.
Incident response imposes unplanned costs that must be budgeted as a contingency. Serious incidents require engineering effort for investigation and remediation, legal effort for reporting to the competent authority and interaction with the market surveillance process, and operational effort for deployer communication and system recovery. A contingency budget for incident response ensures that the organisation can respond effectively without diverting resources from other critical activities, which is particularly important because serious incidents cannot be deferred or scheduled.
GDPR's data minimisation and storage limitation principles must be reconciled with the AI Act's ten-year retention obligation. Collect only what is necessary: where aggregated statistics suffice, anonymise at the point of collection. The lawful basis for PMM processing is typically legitimate interest, supported by the AI Act's monitoring obligation.
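In practice this means applying the retention tier at the point of collection: raw records carrying personal data are reduced to anonymous aggregates, and only the aggregate inherits the long retention period. The tier names, periods, and record fields below are illustrative assumptions.

```python
# Sketch of point-of-collection minimisation under a tiered retention
# approach. Tier names, retention periods, and fields are assumptions.
from collections import Counter

RETENTION_YEARS = {"aggregate_compliance": 10, "raw_operational": 0.5}


def minimise(records: list) -> dict:
    """Reduce raw records to an anonymous aggregate for long-term retention."""
    return {
        "tier": "aggregate_compliance",
        "retention_years": RETENTION_YEARS["aggregate_compliance"],
        "count": len(records),
        "outcome_counts": dict(Counter(r["outcome"] for r in records)),
        # Deliberately excluded: user identifiers, raw inputs, free text.
    }
```

The raw records remain in the short-retention operational tier and are purged on schedule; only the minimised aggregate is retained for the ten-year compliance horizon.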
Useful feedback-loop metrics include the time from finding to decision, the time from decision to completed fix, the share of findings that result in system changes, and the share of fixes that successfully resolve the originating finding.
Deprioritisation is prevented through a separate backlog in which critical actions preempt all other engineering work, warning-level actions enter the next sprint, and informational findings are reviewed quarterly.
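The routing rule above can be encoded directly, which keeps triage consistent and auditable. The severity labels and destinations mirror the text; the function itself is an assumption about how a team might implement the rule.

```python
# Illustrative routing rule for the separate PMM backlog; severity
# labels mirror the text, destination names are assumptions.
def route_finding(severity: str) -> str:
    """Map a monitoring finding's severity to a backlog destination."""
    routes = {
        "critical": "preempt_current_engineering_work",
        "warning": "next_sprint",
        "informational": "quarterly_review",
    }
    if severity not in routes:
        raise ValueError(f"unknown severity: {severity!r}")
    return routes[severity]
```

Rejecting unknown severities outright, rather than defaulting to the lowest tier, prevents misclassified findings from silently landing in the quarterly pile.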
Synthetic monitoring is the mechanism entirely within the provider's control. Sentinel test suites are maintained by the provider, submitting known inputs to the deployed system and verifying the outputs. This detects functional degradation, silent model changes if the deployer has modified the system, and availability problems. Synthetic monitoring cannot detect distributional drift in the real-world input population because the sentinel inputs are fixed, but it provides a baseline behavioural check that requires no deployer cooperation.
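A sentinel suite reduces to fixed input/expected-output pairs submitted on a schedule. In this sketch, `call_model` stands in for the real API client, which is an assumption; any exception is treated as an availability problem and any mismatch as functional drift or a silent model change.

```python
# Minimal sentinel-testing harness. `call_model` is a stand-in for the
# provider's real client for the deployed endpoint (an assumption).
def run_sentinel_suite(call_model, suite: list) -> dict:
    """Run fixed (inputs, expected) pairs against the deployed system."""
    failures = []
    for inputs, expected in suite:
        try:
            actual = call_model(inputs)
        except Exception as exc:      # availability problem
            failures.append({"inputs": inputs, "error": str(exc)})
            continue
        if actual != expected:        # functional drift or silent change
            failures.append({"inputs": inputs,
                             "expected": expected, "actual": actual})
    return {"total": len(suite), "failed": len(failures),
            "failures": failures}
```

A non-zero failure count feeds the alerting pipeline; a clean run is itself retained as evidence that the baseline behavioural check passed for that period.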
A third option is periodic batch export, where the deployer's system generates a structured monitoring data package in CSV, JSON, or Parquet format and uploads it to a secure provider endpoint on a defined schedule. This provides less timely data than telemetry agents or callback APIs but requires less technical integration from the deployer.
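On the deployer side, a batch export can be as simple as a serialised package with a checksum, so the provider can verify integrity on receipt. The package fields, schema version, and upload endpoint are assumptions; the JSON variant of the format is shown.

```python
# Sketch of a deployer-side batch export: one reporting period serialised
# as JSON with a SHA-256 digest for integrity verification on upload.
# Package fields and schema version are illustrative assumptions.
import hashlib
import json


def build_export_package(period: str, metrics: dict) -> tuple:
    """Serialise one reporting period; return (payload_bytes, sha256 hex)."""
    payload = json.dumps(
        {"schema_version": "1.0", "period": period, "metrics": metrics},
        sort_keys=True,  # deterministic output -> reproducible digest
    ).encode("utf-8")
    return payload, hashlib.sha256(payload).hexdigest()
```

Sorting keys makes the serialisation deterministic, so the provider can recompute the digest independently and detect truncated or tampered uploads.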
The AISDP PMM plan must document which mechanisms are used for each deployment, the coverage each mechanism provides, the residual monitoring gaps, and the mitigations for those gaps. Where the provider cannot achieve full PMM visibility, the AISDP must document the limitation and the compensating controls, which may include more frequent sentinel testing, contractual obligations on the deployer, and periodic on-site audits.
As a rough planning estimate, organisations should budget between 15 and 25 per cent of the system's annual development cost for ongoing PMM and compliance maintenance. This figure varies significantly by system complexity, risk level, and deployment scale, but it provides a starting point for financial planning.