KLA Digital
EU AI Act · March 12, 2026 · 14 min read

Post-Market Monitoring Plan for AI Agents: What EU AI Act Article 72 Requires

A practical guide to Article 72 post-market monitoring for high-risk AI systems and agentic workflows: what to monitor after deployment, what belongs in the plan, how it links to Annex IV, and when Article 73 incident reporting is triggered.

If your AI agent is a high-risk AI system under the EU AI Act, or forms part of one, Article 72 requires more than logs and dashboards. It requires a documented post-market monitoring system, based on a post-market monitoring plan, that keeps collecting and analysing relevant data throughout the system's lifetime. For agentic workflows, that means watching not only model quality, but also tool execution, policy denials, escalation events, human overrides, integration failures, and rights-relevant outcomes. This guide reflects the current legal position as of March 12, 2026. It is an implementation guide, not legal advice.

Scope

Article 72 applies when the agentic workflow is a high-risk AI system or forms part of one.

Core Obligation

Providers need a documented post-market monitoring system based on a reviewable plan inside Annex IV technical documentation.

Operational Focus

For agents, monitor tool use, approvals, overrides, integration failures, and rights-relevant outcomes rather than model quality alone.

Escalation Link

Monitoring thresholds should feed corrective action under Article 20 and serious-incident escalation under Article 73.
Figure: the Article 72 operating loop for high-risk AI agents (monitor, review, escalate, report, and feed evidence back into Annex IV documentation).
A workable Article 72 loop is operational, not decorative: collect real production signals, review them against thresholds, escalate incidents fast, and feed the outcomes back into risk, change, and evidence records.

Start With Classification, Not Labels

AI agents are not a separate legal category under the AI Act. The trigger for Article 72 is not the word agent. The trigger is whether the system is classified as high-risk or forms part of a high-risk system.

An internal assistant that drafts meeting notes is not treated the same way as an agent used in credit underwriting, claims triage, hiring, identity verification, public-sector decision support, or other Annex III use cases. If you have not done the classification work yet, start with Classifying High-Risk AI Under the EU AI Act and AI Agent Compliance Under the EU AI Act. Once a system is in scope, post-market monitoring becomes part of the operating model, not an optional best practice.

What Article 72 Actually Requires

Article 72 requires providers of high-risk AI systems to establish and document a post-market monitoring system proportionate to the nature of the AI technology and the risks of the system. The monitoring has to remain active throughout the system's lifetime.

In practice, that obligation has five moving parts. You need to actively and systematically collect, document, and analyse relevant post-deployment data; evaluate continuous compliance with the Chapter III Section 2 requirements; include interaction with other AI systems where relevant; base the monitoring system on a post-market monitoring plan; and keep that plan inside the Annex IV documentation stack. Existing sectoral post-market processes can be reused where Union law already requires equivalent governance, but only if the integrated result still achieves the same level of protection.

  • Collect relevant post-deployment data systematically, not ad hoc.
  • Review the data against ongoing compliance requirements, not only product KPIs.
  • Capture interactions with other AI systems and external tools where they matter.
  • Document the plan inside Annex IV so a reviewer can inspect how monitoring actually works.
  • Integrate with existing sectoral processes only where the protection level remains equivalent.

A Dashboard Is Not a Plan

The most common failure mode is confusing observability with compliance. A dashboard can tell you latency, success rate, cost, or token usage. A post-market monitoring plan should tell a reviewer which risks are being watched, which signals prove the controls still work, what thresholds trigger escalation, who reviews what, and how corrective action is documented.

That is why post-market monitoring is tightly linked to EU AI Act Article 17 Checklist, AI Agent Audit Trails: From Logs to Evidence, and your underlying evidence package. If you cannot show how telemetry becomes reviewed evidence, you do not yet have an Article 72 operating model.

  • Which risks are monitored: tied back to the Article 9 risk register.
  • Which signals matter: not just outputs, but tool calls, approvals, overrides, and downstream effects.
  • Which thresholds matter: what is informational, what needs same-day review, and what freezes production.
  • Which records survive audit: incidents, reviewer rationale, corrective actions, and version-linked evidence exports.
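One way to test whether your plan clears that bar is to see whether it can be expressed as structured data rather than prose. Below is a minimal sketch under assumed names: MonitoredRisk, Severity, and every example value are illustrative choices of ours, not a prescribed format. The point is structural: each monitored risk carries a signal, a threshold, a severity, a named owner, and an escalation path.

```python
from dataclasses import dataclass
from enum import Enum

class Severity(Enum):
    INFORMATIONAL = "informational"   # logged, reviewed at normal cadence
    SAME_DAY = "same_day_review"      # a reviewer must triage within the day
    FREEZE = "freeze_production"      # pause the workflow pending investigation

@dataclass
class MonitoredRisk:
    """One row of the plan: risk -> signal -> threshold -> owner -> escalation."""
    risk_id: str            # ties back to the Article 9 risk register
    signal: str             # the production metric that proves the control works
    threshold: float        # crossing this value triggers the consequence below
    severity: Severity
    owner: str              # a named reviewer, not a team alias
    escalation_path: str    # where a breach goes, and how fast

# Illustrative entry: unauthorized tool-call attempts on a credit-decision agent.
example = MonitoredRisk(
    risk_id="R-014-unauthorized-tool-use",
    signal="unauthorized_tool_call_rate_per_1k_runs",
    threshold=1.0,
    severity=Severity.FREEZE,
    owner="compliance-reviewer-on-call",
    escalation_path="incident-queue -> Article 73 triage within 24h",
)
```

If a risk in your Article 9 register cannot be filled into a row like this, it is probably not actually being monitored.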

Why Article 72 Matters More for AI Agents

Static prediction systems can fail quietly. Agents fail across systems. They retrieve data, call tools, trigger workflows, hand work to other models, and operate with different levels of autonomy. That means monitoring for agents has to be workflow-level, not just model-level.

The logic behind post-market monitoring is especially relevant for agentic systems even when the model itself is not continuously retrained. Emerging risk can come from prompt changes, policy updates, model swaps, tool access expansion, retrieval changes, integration drift, or deployer-side workflow edits. The system can become riskier without a classic training cycle ever taking place.

  • Tool execution: failed actions, retries, side effects, and unauthorized attempts.
  • Policy control signals: denials, near-denials, approval queues, and override reasons.
  • Cross-system behavior: what happens when the agent relies on other AI services or external automations.
  • Rights-relevant outcomes: disparities, complaints, appeals, and materially harmful downstream effects.
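Capturing those signals at all means recording telemetry at the workflow level, with tool calls, denials, and overrides correlated to a single agent run. Here is a hedged sketch of what such an event record could look like; the field names and event types are our assumptions, not a standard schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentEvent:
    """One workflow-level telemetry event; all field names are illustrative."""
    run_id: str                 # correlates every event in one agent run
    event_type: str             # e.g. "tool_call", "policy_denial", "override"
    tool: str | None = None     # which tool or external system was touched
    outcome: str = "success"    # "success", "failure", "denied", "escalated"
    actor: str = "agent"        # "agent", "human_reviewer", "upstream_ai"
    detail: dict = field(default_factory=dict)
    ts: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# A policy denial and the human override that followed, tied to one run:
events = [
    AgentEvent("run-8841", "policy_denial", tool="payments_api",
               outcome="denied", detail={"rule": "limit_exceeded"}),
    AgentEvent("run-8841", "override", outcome="escalated",
               actor="human_reviewer", detail={"reason": "verified invoice"}),
]
```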

What Belongs in a Defensible Article 72 Plan

A defensible plan is not long for the sake of looking serious. It is specific enough that a reviewer could understand how you detect degradation, who investigates anomalies, and how the evidence feeds back into conformity and change control.

For AI agents, the plan should usually include at least the following building blocks. This is also the point where many teams pair the legal requirement with the Post-market Monitoring Plan Template, Human Oversight Procedure Playbook, and Evidence Pack Checklist so the process is usable by engineering and operations teams, not just lawyers.

  • System identification and intended purpose: the exact system name, version, provider entity, release scope, deployment context, affected users, boundaries, and high-risk classification logic.
  • Monitoring objectives tied to the risk register: every material risk should have at least one monitored signal, one owner, and one escalation path.
  • Data sources and collection methods: execution logs, tool traces, approval events, override records, complaints, sampled outputs, incident tickets, and integration-change records.
  • Metrics, thresholds, and severity levels: task failure rate, tool error rate, hallucination or unsupported-action rate where relevant, override frequency, disparity indicators, rollback events, and threshold-triggered consequence logic.
  • Sampling policy and review cadence: always-on telemetry, 100% review for critical actions, baseline sampling for lower-risk outputs, and intensified review after releases, model swaps, or incidents.
  • Human oversight and intervention data: what was escalated, who reviewed it, what context they saw, what they changed, and whether the same failure pattern is repeating.
  • Incident handling and Article 73 linkage: who investigates, what becomes an incident, which authority deadlines apply, and how the reporting clock is identified and evidenced.
  • Corrective action and change control: who can pause, disable, roll back, or withdraw the system and how post-market findings feed back into Article 9 risk management and Article 17 quality management.
  • Evidence retention and audit readiness: how logs, reviewer notes, incidents, and corrective actions are retained, protected, linked to versions, and exported for regulatory review.
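The sampling and cadence bullet is where plans most often stay vague. Here is a hedged sketch of how a sampling policy could be made explicit in code; the tiers and rates are invented placeholders, and the real values should come from your own risk register.

```python
def sampling_rate(action_criticality: str,
                  days_since_last_release: int,
                  open_incidents: int) -> float:
    """Return the fraction of outputs routed to human review.

    Illustrative policy only: critical actions are always reviewed,
    and review intensifies after releases or while incidents are open.
    """
    if action_criticality == "critical":
        return 1.0                      # 100% review for critical actions
    base = 0.05                         # baseline sampling for low-risk outputs
    if days_since_last_release < 7:
        base = max(base, 0.25)          # intensified review after a release
    if open_incidents > 0:
        base = max(base, 0.50)          # intensified review during incidents
    return base

assert sampling_rate("critical", 30, 0) == 1.0
assert sampling_rate("routine", 3, 0) == 0.25
```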

When Monitoring Becomes Article 73 Reporting

Your monitoring plan should say clearly when an alert becomes an incident, who owns triage, and when Article 73 reporting is triggered. That clock should not be left to improvisation.

Under the current Article 73 framework, the default outside limit is 15 days after the provider or deployer becomes aware of a serious incident. That limit shortens to 10 days where a person has died, once causality or reasonable suspicion of causality is established, and to 2 days for widespread infringement of obligations intended to protect fundamental rights or serious and irreversible disruption of critical infrastructure. The key operational point is simple: your thresholds and escalation rules need to surface those scenarios fast enough that reporting is still possible. The sketch after the list below encodes these outside limits.

  • Death or serious harm to health requires the fastest triage and a clearly evidenced causality decision path.
  • Serious and irreversible disruption of critical infrastructure should bypass slow internal review queues.
  • Fundamental-rights infringements are not abstract legal issues; they need operational triggers tied to complaints, overrides, appeals, and harmful downstream outcomes.
  • Corrective action under Article 20 should be wired into the same workflow so mitigation does not wait for final reporting paperwork.
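A minimal sketch of that deadline logic, assuming your incident triage has already classified the scenario. The classification itself, including causality judgments, stays with humans; the type keys and function name here are our own.

```python
from datetime import datetime, timedelta

# Outside limits under the current Article 73 framework, keyed by scenario.
REPORTING_LIMITS = {
    "widespread_rights_infringement": timedelta(days=2),
    "critical_infrastructure_disruption": timedelta(days=2),
    "death": timedelta(days=10),
    "serious_incident_default": timedelta(days=15),
}

def reporting_deadline(incident_type: str, aware_at: datetime) -> datetime:
    """Latest permissible report time, measured from awareness.

    'Immediately' still applies; this computes only the hard outer bound.
    """
    limit = REPORTING_LIMITS.get(incident_type,
                                 REPORTING_LIMITS["serious_incident_default"])
    return aware_at + limit

deadline = reporting_deadline("death", datetime(2026, 3, 12, 9, 0))
# -> 2026-03-22 09:00, ten days after awareness
```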

Providers Own Article 72, but Deployers Feed It

Article 72 is a provider obligation, but the provider often does not control the full operational context. Deployers may hold the complaint records, human-review notes, frontline incident tickets, or downstream outcome data that make monitoring meaningful.

For agentic systems, provider-deployer contracts should specify which data the deployer must return, how fast serious incidents and near-misses are escalated, how override records are retained, and how workflow or integration changes are communicated. If you do not design this interface, the plan looks complete on paper and fails in production.

  • Data return obligations for complaints, overrides, appeals, and outcome metrics.
  • Incident reporting channels with named contacts and jurisdiction mapping.
  • Change notifications when the deployer alters prompts, tools, workflows, or surrounding controls.
  • Retention and export rules so post-market evidence does not fragment across systems.
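If the provider-deployer interface is contractual, it also helps to agree a machine-readable escalation payload up front, so deployer data lands in the provider's monitoring system without manual re-entry. The fields below are an assumption about what such a payload could carry, not a regulatory format.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class DeployerEscalation:
    """Illustrative payload a deployer returns to the provider."""
    system_version: str        # exact deployed version, for evidence linkage
    category: str              # "complaint", "override", "incident", "change"
    summary: str
    jurisdiction: str          # drives which authority deadlines may apply
    occurred_at: datetime
    reported_at: datetime      # the gap between these two is itself a metric
    contact: str               # named contact per the escalation channel
    attachments: list[str]     # log excerpts, tickets, reviewer notes
```

The gap between occurred_at and reported_at is worth monitoring in its own right: it measures whether the escalation channel actually works before you need it for an Article 73 clock.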

The March 2026 Template and Timeline Position

The safest current-law framing is this: the template question is unsettled; the provider obligation is not. The current Article 72 text still refers to a Commission implementing act for a plan template. But on November 19, 2025, the Commission proposed replacing that prescription with guidance rather than a harmonised plan format. That proposal is still under consideration, so it does not change the law in force today.

The same caution applies to the timeline. Under the current AI Act, most high-risk obligations apply on August 2, 2026, while certain product-embedded high-risk AI systems have a longer transition to August 2, 2027. The Commission has proposed timeline changes, but those changes are not yet law. The practical move is to build a reviewable plan now and adjust the format later if the Commission finalises template-related guidance or legislative amendments.

Common Mistakes That Make Article 72 Plans Fail Review

The failures are usually operational, not theoretical. Teams write the plan as if it were a memo, but regulators and auditors will treat it as evidence of a live operating capability.

If you want a practical review test, ask whether a third party could look at your plan and tell how the system is monitored, who investigates anomalies, how production changes affect sampling, and what gets reported or rolled back when thresholds are crossed.

  • Treating post-market monitoring as a dashboard instead of an operating procedure.
  • Monitoring only model quality while ignoring tool use, approvals, overrides, and downstream effects.
  • Listing metrics without thresholds, named owners, severity logic, or response times.
  • Forgetting interactions with other AI systems in agentic workflows.
  • Keeping raw logs but not incident decisions, reviewer rationale, or corrective-action evidence.
  • Disconnecting monitoring from change control so every release resets risk without tighter review.
  • Assuming that model-vendor documentation replaces the provider's own system-level monitoring.

Frequently Asked Questions

Does Article 72 apply to AI agents?

AI agents are not a separate legal category under the EU AI Act. Article 72 applies when the agentic system is a high-risk AI system, or part of one. The trigger is high-risk scope, not the marketing label agent.

What is the difference between a post-market monitoring system and a post-market monitoring plan?

The system is the live operating capability: the data collection, review, escalation, investigation, and corrective-action processes that run after deployment. The plan is the documented description of how that system works. Article 72 requires both.

What should a post-market monitoring plan include?

At minimum, it should define system scope, monitoring objectives, data sources, metrics, thresholds, sampling rules, review cadence, incident handling, corrective actions, and evidence retention. For AI agents, it should also address tool use, human approvals, overrides, and interactions with other AI systems.

What triggers Article 73 reporting?

Serious incidents under the AI Act include death or serious harm to health, serious and irreversible disruption of critical infrastructure, infringement of obligations under Union law intended to protect fundamental rights, and serious harm to property or the environment. Once the provider establishes a causal link, or a reasonable likelihood of one, the reporting obligation starts running.

Do we need the Commission template before we can comply?

No. The obligation to maintain a documented post-market monitoring system and plan exists independently of the template question. Build a defensible plan now from the legal text, your system boundaries, and your operating model, then adapt the format later if the Commission finalises guidance or amendments.

Can financial institutions and medical device teams reuse existing post-market processes?

Yes, where existing sectoral monitoring or internal governance processes integrate the necessary Article 72 elements and achieve an equivalent level of protection. The AI Act allows integration; it does not require duplicate paperwork for the same control objective.

When do the high-risk AI rules apply?

Under the current AI Act timeline, most high-risk AI rules apply on August 2, 2026, while certain product-embedded high-risk AI systems have a longer transition to August 2, 2027. The Commission has proposed timeline changes, but those proposals are still under consideration as of March 12, 2026.

Key Takeaways

Article 72 is not asking for a decorative compliance memo. It is asking providers of high-risk AI systems to run a real post-market monitoring capability and document it in a reviewable plan. For AI agents, that means watching not only model behavior but also tool execution, approval gates, overrides, integration risks, and rights-relevant outcomes. Teams that do this well treat post-market monitoring as the back half of risk management: production data comes in, issues are reviewed against thresholds, incidents can be escalated under Article 73, corrective actions are taken under Article 20, and the whole loop becomes evidence.

See It In Action

Ready to automate your compliance evidence?

Book a 20-minute demo to see how KLA helps you prove human oversight and export audit-ready Annex IV documentation.