Deployers have their own monitoring obligations under Article 26, including monitoring system operation, reporting incidents, and suspending use when risk is identified. Providers must establish structured feedback channels, define contractual data sharing requirements, and implement compensating strategies where production visibility is limited. Synthetic monitoring and periodic audits bridge visibility gaps.
Deployers of high-risk systems have their own PMM obligations under Article 26. They must monitor the system's operation using the provider's Instructions for Use, inform the provider of any serious incidents they become aware of, and suspend the system's use if they have reason to believe it presents a risk. Providers must establish deployer communication channels that enable deployers to report incidents, anomalies, and concerns efficiently, and PMM planning must specify how deployer feedback is received, triaged, and incorporated into the monitoring analysis.
The deployer's Article 26 monitoring obligation is only as effective as the guidance the provider supplies. The Instructions for Use under Article 13 must include sufficient operational monitoring guidance for the deployer to fulfil their obligation. This guidance should specify the minimum monitoring activities the deployer must perform: reviewing system outputs for consistency and plausibility, tracking human oversight metrics within the deployer's own operations, monitoring complaint and appeal rates from affected persons, and observing the system's behaviour for changes that might indicate degradation or drift.
Where the provider cannot perform direct monitoring because the deployer controls the production environment, the Instructions for Use should define the minimum data the deployer must collect and share with the provider. This might include aggregated performance statistics, anonymised samples of system outputs for quality assessment, human oversight metrics covering override rates and review times, and a summary of complaints and incidents observed at the deployer level. The Legal and Regulatory Advisor formalises these data sharing requirements in the deployment contract as binding obligations, not merely recommendations.
The provider should also supply clear criteria for when to suspend the system. Article 26(5) requires deployers to suspend use when they consider the system presents a risk, but deployers may lack the technical knowledge to assess risk without guidance. The Instructions for Use should provide specific, observable indicators that should trigger suspension: output patterns suggesting systematically biased results, a sudden change in the system's output distribution, error rates exceeding a defined threshold, or any indication that the system is being used for a purpose outside its documented intended use.
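As an illustration, the sketch below shows how such indicators might be expressed as automated checks. The metric names and threshold values are hypothetical placeholders; the authoritative definitions belong in the Instructions for Use.

```python
# Illustrative suspension checks built from observable indicators.
# Metric names and thresholds are hypothetical; the authoritative
# values belong in the Instructions for Use.
from dataclasses import dataclass

@dataclass
class MonitoringSnapshot:
    error_rate: float                 # fraction of failed or invalid outputs
    positive_rate_by_group: dict      # outcome rate per deployer-defined segment
    output_distribution_shift: float  # divergence versus the documented baseline

def suspension_indicators(s: MonitoringSnapshot,
                          max_error_rate: float = 0.05,
                          max_group_disparity: float = 0.20,
                          max_shift: float = 0.25) -> list[str]:
    """Return the triggered suspension indicators; an empty list means continue."""
    triggered = []
    if s.error_rate > max_error_rate:
        triggered.append(f"error rate {s.error_rate:.1%} exceeds threshold")
    rates = list(s.positive_rate_by_group.values())
    if rates and max(rates) - min(rates) > max_group_disparity:
        triggered.append("output disparity across segments suggests systematic bias")
    if s.output_distribution_shift > max_shift:
        triggered.append("sudden change in the output distribution")
    return triggered
```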
Deployer feedback is one of the most valuable PMM data sources because deployers observe the system's behaviour in real-world conditions that may differ significantly from the development and testing environment. The provider must establish structured feedback channels that make it easy for deployers to report issues.
A dedicated reporting portal or API endpoint enables deployers to submit incident reports, anomaly observations, and general feedback in a structured format. The portal should collect the deployer's identity, the affected system version, the date and time of the observation, a description of the observed behaviour, the expected behaviour, and any supporting evidence. A ticketing platform such as Zendesk or ServiceNow provides an off-the-shelf alternative with SLA tracking, routing rules, and escalation procedures.
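A minimal sketch of the structured payload such a portal might accept follows; the field names mirror the list above but are illustrative rather than a prescribed schema.

```python
# A minimal sketch of the structured report a portal or API endpoint
# might accept; field names are illustrative, not a prescribed schema.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class DeployerReport:
    deployer_id: str
    system_version: str
    observed_at: datetime
    observed_behaviour: str
    expected_behaviour: str
    evidence_uris: list[str] = field(default_factory=list)  # logs, screenshots

    def validate(self) -> None:
        # Reject empty mandatory fields so malformed reports fail fast.
        for name in ("deployer_id", "system_version",
                     "observed_behaviour", "expected_behaviour"):
            if not getattr(self, name):
                raise ValueError(f"missing required field: {name}")
```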
The Technical SME triages incoming deployer feedback within a defined timeframe. The AI Governance Lead defines the triage SLA: critical reports suggesting the system caused harm to an individual are triaged within 4 hours and escalated immediately to the incident response process; standard reports are triaged within 2 business days. The triage process classifies each piece of feedback as a potential serious incident, compliance concern, performance issue, feature request, or informational item. The engineering team routes reports suggesting performance degradation to the PMM team for investigation. The Technical SME logs feature requests and general feedback for product development consideration. Triage assigns each item to the appropriate internal team and acknowledges receipt to the deployer, confirming that the feedback has entered the triage process.
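The routing logic can be expressed compactly, as in the sketch below. The SLA values follow the text; the severity labels and team names are hypothetical, and business-day arithmetic is simplified to calendar days.

```python
# Illustrative triage routing under the SLAs described above. Team names
# are hypothetical; business days are simplified to calendar days.
from datetime import timedelta

TRIAGE_RULES = {
    "potential_serious_incident": (timedelta(hours=4), "incident_response"),
    "compliance_concern":         (timedelta(days=2),  "governance"),
    "performance_issue":          (timedelta(days=2),  "pmm_team"),
    "feature_request":            (timedelta(days=2),  "product"),
    "informational":              (timedelta(days=2),  "pmm_team"),
}

def route(severity: str) -> tuple[timedelta, str]:
    """Return (triage SLA, owning team) for a classified report."""
    # Unknown classifications fall back to the strictest SLA for safety.
    return TRIAGE_RULES.get(severity, TRIAGE_RULES["potential_serious_incident"])
```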
Many high-risk AI systems are deployed by third-party deployers who control the production environment. The provider may have limited or no direct access to inference logs, operator behaviour data, or real-world outcomes. This creates a structural monitoring gap: the provider bears the PMM obligation under Article 72, but the deployer holds the data the provider needs to fulfil it.
The deployment contract should specify the minimum data the deployer must provide, the format and frequency of delivery, the quality standards the data must meet, and the consequences of non-provision. A deployer who signs a contract agreeing to share monthly aggregated performance statistics and then fails to deliver creates a compliance risk for the provider. The minimum contractual data package should include inference volumes, broken down by request type where applicable; error rates and error type distributions; human oversight metrics covering override rates, escalation rates, and average review times; a summary of deployer-observed anomalies and complaints; and confirmation that the system is being used within its documented intended purpose. For systems where disaggregated fairness monitoring is required, the contract should address whether the deployer will provide demographic data or proxy indicators, subject to the data governance controls. The contract should include escalation provisions for data delivery failures and, in extreme cases, the provider's right to suspend the system until the data supply is restored.
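A sketch of how the minimum data package might be assembled and validated before submission follows; the field names are illustrative, and the binding definition remains the deployment contract.

```python
# A sketch of the monthly contractual data package, serialised as JSON.
# Field names are illustrative; the deployment contract is authoritative.
import json

def build_monthly_package(month: str, metrics: dict) -> str:
    required = [
        "inference_volume",         # optionally broken down by request type
        "error_rate",
        "error_type_distribution",
        "override_rate",            # human oversight metrics
        "escalation_rate",
        "avg_review_time_seconds",
        "anomaly_summary",
        "complaint_summary",
        "intended_use_confirmed",   # boolean attestation
    ]
    missing = [k for k in required if k not in metrics]
    if missing:
        raise ValueError(f"incomplete data package, missing: {missing}")
    return json.dumps({"reporting_month": month, **metrics}, indent=2)
```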
Where the contractual framework establishes the data sharing obligation, the technical implementation determines whether the obligation is met in practice. Relying on the deployer to produce manual reports on a monthly schedule is fragile and slow. Where feasible, the provider should implement automated data pipelines that collect monitoring data from the deployer's environment with minimal manual intervention. Options include an agent or sidecar process deployed alongside the AI system in the deployer's infrastructure, which streams anonymised monitoring telemetry back to the provider's PMM system; a callback API allowing the deployer's integration layer to send structured event data to the provider's monitoring endpoint after each inference; or periodic batch export, where the deployer's system generates a structured monitoring data package in CSV, JSON, or Parquet format and uploads it to a secure provider endpoint on a defined schedule. The choice depends on the deployer's technical sophistication, their security and data governance constraints, and the monitoring latency the provider's PMM plan requires.
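The callback option, for instance, might look like the sketch below, in which the deployer's integration layer posts a structured event after each inference. The endpoint URL and payload fields are hypothetical.

```python
# Sketch of the callback-API option, deployer side: post a structured
# event to the provider's monitoring endpoint after each inference.
# The endpoint URL and payload fields are hypothetical.
import json
import urllib.request

MONITORING_ENDPOINT = "https://pmm.provider.example/v1/events"  # hypothetical

def send_inference_event(system_version: str, latency_ms: float,
                         outcome: str) -> None:
    payload = {
        "system_version": system_version,
        "latency_ms": latency_ms,
        "outcome": outcome,  # e.g. "ok", "error", "overridden"
    }
    req = urllib.request.Request(
        MONITORING_ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises URLError on network failure and HTTPError on 4xx/5xx,
    # so delivery failures surface to the caller rather than passing silently.
    with urllib.request.urlopen(req, timeout=5) as resp:
        resp.read()
```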
Even with contractual and technical mechanisms in place, the provider may not receive all the data it needs. Deployer data may be delayed, incomplete, or of variable quality. The PMM plan must define compensating strategies for operating under partial visibility.
Periodic audits, in which the provider's PMM team visits the deployer site or conducts a remote review to verify monitoring data quality and completeness, provide a direct check on the data pipeline's reliability. Annual monitoring audits, documented and retained as evidence, strengthen the provider's compliance posture even when day-to-day data flows are imperfect. Synthetic monitoring provides an independent performance check that does not depend on deployer-provided data, with test cases spanning the system's intended use cases and including edge cases relevant to the risk register.
Deployer satisfaction surveys conducted quarterly provide qualitative feedback on the system's real-world performance from the deployer's perspective. Survey results are a leading indicator: a decline in deployer satisfaction often precedes the formal incident reports that the structured feedback channel captures.
For organisations using procedural approaches, monitoring capability is reduced but still achievable. Weekly, the sentinel dataset is submitted to the deployed system manually and the outputs are compared against the baseline, with results recorded in a spreadsheet. The AI Governance Lead requires deployers to submit structured monthly performance reports via email or a shared document template rather than automated telemetry. A deployer communication log provides a structured record of every communication to and from each deployer, with date, subject, content summary, and any action items. A structured feedback form, distributed as a Word or PDF template, gives deployers a channel for incident reports, performance concerns, and general feedback. Quarterly deployer review meetings, or email exchanges for smaller deployments, maintain the relationship. A deployer health spreadsheet tracks each deployer, their current system version, last communication date, open issues, and telemetry status.
Manual reporting is fragile and slow. Where feasible, automated data pipelines through telemetry agents or callback APIs are strongly preferred for timely and reliable monitoring.
Individual deployer reports may appear minor in isolation, but patterns across multiple deployers can reveal systemic issues. The PMM team aggregates deployer feedback and analyses it for trends: recurring complaints about specific output types, clusters of anomaly reports from a particular deployment context, or gradual changes in deployer satisfaction metrics. These trends feed into the risk register and may trigger threshold recalibration or proactive investigation.
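A minimal sketch of this cross-deployer aggregation is shown below; the report structure and the three-deployer threshold are illustrative.

```python
# Sketch of cross-deployer trend detection: flag feedback categories
# reported by several distinct deployers. Thresholds are illustrative.
from collections import defaultdict

def recurring_issues(reports: list[dict], min_deployers: int = 3) -> dict:
    """reports: [{'deployer_id': ..., 'category': ...}, ...].
    Return categories reported by at least min_deployers distinct deployers."""
    deployers_per_category = defaultdict(set)
    for r in reports:
        deployers_per_category[r["category"]].add(r["deployer_id"])
    return {cat: sorted(ds)
            for cat, ds in deployers_per_category.items()
            if len(ds) >= min_deployers}
```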
The communication channel must be bidirectional. The provider must also communicate to deployers: system updates and version changes with the timing, content, and impact of each change; PMM findings that may affect the deployer's use of the system; known limitations and emerging risks that have been identified through monitoring; and updated Instructions for Use when the system's operational guidance changes. These communications are structured and tracked by the AI Governance Lead with read receipts or acknowledgement requirements for critical communications, and retained as Module 11 evidence.
Closing the feedback loop is essential for sustaining deployer engagement. Deployers who report issues should receive confirmation that their feedback was received, a summary of the investigation outcome within the bounds of confidentiality, and notification of any corrective actions that affect them. A feedback loop that goes dark, where the deployer reports an issue and hears nothing, erodes trust and discourages future reporting, which in turn weakens the PMM system's detection capability. The provider should also monitor the health of each deployer relationship: are deployers using the telemetry pipeline, submitting feedback through structured channels, acknowledging critical communications, and updating to current system versions? Deployers who are non-responsive to communications, who have stopped submitting telemetry, or who are running significantly outdated versions represent a compliance risk requiring a defined escalation process, culminating if necessary in contractual remedies or service suspension.
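These relationship-health checks lend themselves to a simple periodic evaluation, sketched below with hypothetical field names and staleness thresholds.

```python
# Sketch of a deployer-health check over the signals named above.
# Field names and the staleness threshold are hypothetical.
from datetime import datetime, timedelta

def health_flags(deployer: dict, now: datetime, current_version: str,
                 telemetry_max_age: timedelta = timedelta(days=7)) -> list[str]:
    """Return the compliance-risk flags raised by one deployer record."""
    flags = []
    if now - deployer["last_telemetry_at"] > telemetry_max_age:
        flags.append("telemetry stopped")
    if deployer["unacknowledged_critical_comms"] > 0:
        flags.append("critical communication unacknowledged")
    if deployer["system_version"] != current_version:
        flags.append("running outdated version")
    return flags  # any flag feeds the defined escalation process
```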
Three technical mechanisms bridge the visibility gap, and most deployments require all three. Telemetry agents are lightweight monitoring components that the provider packages alongside the model. An OpenTelemetry Collector sidecar or Fluent Bit forwarder runs in the deployer's environment, collects inference metadata including input distributions, output distributions, latency, and error rates in a structured format, and transmits it to the provider's monitoring infrastructure. The Technical SME designs the telemetry to minimise the data transmitted: distributional summaries and aggregate metrics rather than raw inference data, to respect the deployer's data sovereignty and minimise bandwidth. The telemetry schema, transmission frequency, and data handling terms are documented by the Legal and Regulatory Advisor in the deployer agreement.
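To illustrate the data-minimisation principle, the sketch below shows deployer-side aggregation of a monitoring window into distributional summaries before transmission; the summary fields are illustrative.

```python
# Sketch of deployer-side aggregation before transmission: the agent
# sends distributional summaries, never raw inference records.
# The summary fields are illustrative.
import statistics

def summarise_window(latencies_ms: list[float], errors: int, total: int) -> dict:
    return {
        "count": total,
        "error_rate": errors / total if total else 0.0,
        "latency_p50_ms": statistics.median(latencies_ms) if latencies_ms else None,
        "latency_mean_ms": statistics.fmean(latencies_ms) if latencies_ms else None,
        # Extend with histogram buckets for input/output distributions as needed.
    }
```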
Callback APIs provide a structured channel for the deployer to report events to the provider. Published webhook endpoints cover specific event types: performance degradation reports, incident notifications, user complaints, and ground truth feedback. Deployers call these endpoints when events occur. The API schema should be pre-defined, documented in the Instructions for Use, and include validation to ensure data quality. This mechanism depends on deployer cooperation, and the deployer agreement should include an obligation to use the callback APIs and a defined SLA for reporting.
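A provider-side endpoint for these events might look like the following sketch, written with Flask for brevity; the event types mirror those listed above, and the schema is illustrative.

```python
# Sketch of a provider-side webhook endpoint for deployer event reports,
# using Flask for brevity. The route and schema are illustrative.
from flask import Flask, jsonify, request

app = Flask(__name__)
EVENT_TYPES = {"performance_degradation", "incident",
               "user_complaint", "ground_truth_feedback"}

@app.post("/v1/deployer-events")
def deployer_event():
    event = request.get_json(silent=True) or {}
    # Validate before accepting, so malformed reports are rejected loudly.
    if event.get("event_type") not in EVENT_TYPES:
        return jsonify(error="unknown or missing event_type"), 400
    if not event.get("deployer_id") or not event.get("system_version"):
        return jsonify(error="deployer_id and system_version required"), 400
    # In production: persist the event and route it into the triage queue.
    return jsonify(status="accepted"), 202
```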
Synthetic monitoring is the mechanism entirely within the provider's control. The provider maintains sentinel test suites that submit known inputs to the deployed system at defined intervals and verify the outputs. This detects functional degradation, silent model changes if the deployer has modified the system, and availability problems. Synthetic monitoring cannot detect distributional drift in the real-world input population because the sentinel inputs are fixed, but it provides a baseline behavioural check that requires no deployer cooperation.
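A sentinel run reduces to submitting fixed cases and diffing the results, as in the sketch below; `call_system` stands in for the real inference client and is hypothetical.

```python
# Sketch of a sentinel run: submit fixed known inputs to the deployed
# system and compare against expected outputs. `call_system` stands in
# for the real inference client and is hypothetical.
def run_sentinel_suite(call_system, sentinel_cases: list[dict]) -> list[dict]:
    """sentinel_cases: [{'input': ..., 'expected': ...}, ...].
    Return the failing cases; an empty list means the baseline holds."""
    failures = []
    for case in sentinel_cases:
        actual = call_system(case["input"])
        # For non-deterministic systems, replace exact equality with a
        # tolerance or semantic comparison appropriate to the output type.
        if actual != case["expected"]:
            failures.append({**case, "actual": actual})
    return failures
```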
The PMM plan must document which mechanisms are used for each deployment, the coverage each mechanism provides, the residual monitoring gaps, and the mitigations for those gaps. Where the provider cannot achieve full PMM visibility, the plan must document the limitation and the compensating controls, including more frequent sentinel testing, contractual obligations on the deployer, and periodic on-site audits.
Automated telemetry collection, SLA-tracked ticket routing, and real-time deployer monitoring are lost with the procedural approach. The manual approach depends on deployer cooperation and the governance team's follow-up discipline. OpenTelemetry Collector is open-source and provides automated telemetry at zero licence cost, making it the recommended first step beyond manual processes.