Assurance Center
Assurance Center keeps governed agents trustworthy after launch: baselines, drift alerts, fairness checks, and tracked remediation in one loop.
Assurance Center is the continuous-quality surface of the KLA Control Plane, served at the route /measurement. The KLA Control Plane is a govern-in-place runtime safety, audit, and governance layer for enterprise AI agents: you instrument your existing agents instead of re-platforming them. Policies block what is clearly wrong at execution time, but most agent failures are quieter: answers that slowly get worse, formatting that degrades, outcomes that skew against one group of users. Assurance Center watches for that slow erosion in production and turns it into something you can see, prove, and fix. It delivers the Assure pillar of the product story (Govern. Operate. Assure. Prove.), providing independent verification that an agent still behaves the way it did the day you approved it.
Who uses it
Compliance, risk, and audit officers live here when they need to show that an automated decision system stays fair and accurate over time, not just at launch. Platform operators use it to catch quality regressions before users report them, and to confirm a new Rollout (a deployment of an agent Release) did not degrade behavior. Developers and integrators set the baselines and wire human feedback back into evaluation, closing the loop between what shipped and how it actually performs.
Why continuous assurance matters
An agent that passed every check at launch can still drift. Model providers update weights, your data distribution shifts, prompts get edited, and downstream tools change their output format. None of this trips a policy block, yet quality quietly falls. Governance does not end at deployment: a system you certified six months ago is trustworthy only if you can demonstrate that it still behaves as certified today.
Key capabilities
Baselines. A baseline is a verified snapshot of correct agent behavior: a labeled set of expected outputs, quality scores, and outcome distributions captured from a Release you trust. Every later evaluation is measured against it, so "good" is defined by your own approved behavior rather than a vendor default.
Drift monitoring and Assurance Alerts. Assurance Center continuously scores live outputs against the baseline on semantic similarity, formatting validity, hallucination rate, and cost or latency movement. When a metric crosses its threshold, Assurance Center raises an Assurance Alert (the canonical object for a drift issue). Each alert carries the affected agent, the metric that moved, the magnitude of the change, and a link into Lineage Explorer so you can inspect the exact runs behind the regression. Open Assurance Alerts also surface in Command's Triage queue.
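As a rough illustration of the threshold check described above, the sketch below compares live metric values with baseline values and produces an alert record when a metric moves further than its threshold allows. The class, function, field, and metric names, the use of absolute difference as the magnitude, and all values are assumptions made for this example; it is not the Assurance Center API or its actual scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssuranceAlert:
    """Hypothetical alert record; field names are illustrative only."""
    agent: str
    metric: str
    baseline_value: float
    live_value: float
    magnitude: float  # absolute movement away from the baseline


def check_drift(agent, baseline, live, thresholds):
    """Return one alert per metric whose movement exceeds its threshold.

    baseline, live: metric name -> value
    thresholds: metric name -> maximum tolerated absolute movement
    """
    alerts = []
    for metric, limit in thresholds.items():
        if metric not in baseline or metric not in live:
            continue  # nothing to compare for this metric
        magnitude = abs(live[metric] - baseline[metric])
        if magnitude > limit:
            alerts.append(
                AssuranceAlert(agent, metric, baseline[metric], live[metric], magnitude)
            )
    return alerts


# Assumed example values, not real measurements.
baseline = {"semantic_similarity": 0.92, "hallucination_rate": 0.02}
live = {"semantic_similarity": 0.81, "hallucination_rate": 0.03}
thresholds = {"semantic_similarity": 0.05, "hallucination_rate": 0.02}

alerts = check_drift("claims-triage", baseline, live, thresholds)
for alert in alerts:
    print(f"{alert.agent}: {alert.metric} moved by {alert.magnitude:.2f}")
```

With these made-up numbers only semantic similarity moves past its threshold, so a single alert is produced. The hallucination rate also moved, but it stays within the tolerance assumed here.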
Bias and fairness cohorts. You define cohorts (groups such as age brackets, regions, or product tiers) and Assurance Center tracks how automated outcomes distribute across them. If a claims-triage agent starts approving one cohort at a materially different rate than another, that disparity becomes an Assurance Alert with the cohort breakdown attached, giving auditors concrete evidence of fairness monitoring.
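One simple way to picture a cohort disparity check is to compute the approval rate per cohort and compare the largest gap with a tolerance. The sketch below does that, assuming decisions arrive as (cohort, approved) pairs. The function names, the cohort labels, the 0.10 tolerance, and the choice of a plain rate gap as the disparity measure are all hypothetical; the source does not say how Assurance Center defines a "materially different" rate.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Map each cohort to its approval rate.

    decisions: iterable of (cohort, approved) pairs, approved being a bool.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for cohort, was_approved in decisions:
        totals[cohort] += 1
        if was_approved:
            approved[cohort] += 1
    return {cohort: approved[cohort] / totals[cohort] for cohort in totals}


def max_rate_gap(rates):
    """Largest difference in approval rate between any two cohorts."""
    return max(rates.values()) - min(rates.values())


# Made-up decisions for two age cohorts, 100 each.
decisions = (
    [("18-34", True)] * 80 + [("18-34", False)] * 20
    + [("65+", True)] * 60 + [("65+", False)] * 40
)

MAX_GAP = 0.10  # assumed tolerance, not a product default

rates = approval_rates(decisions)
gap = max_rate_gap(rates)
if gap > MAX_GAP:
    print(f"Disparity of {gap:.2f} exceeds tolerance; cohort breakdown: {rates}")
```

Keeping the per-cohort rates alongside the gap mirrors the idea of attaching the cohort breakdown to the alert, so a reviewer sees which groups diverged and not only that a limit was crossed.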
Remediation Plans. Every alert can open a Remediation Plan: a tracked record of what is wrong, who owns it, and how the model, prompt, or policy boundary will be tuned to resolve it. The plan stays linked to its alert and to the runs that triggered it, so the full path from detection to fix is auditable and nothing is silently closed.
Human annotations. Reviewers and downstream QA can attach annotations (correct/incorrect labels, severity, and notes) to specific agent outputs. These human judgments feed directly back into the evaluation data store, sharpening future scoring and strengthening the next baseline.
The assurance loop
flowchart LR
B["Baseline"] --> M["Monitor live outputs"]
M --> D{"Within threshold?"}
D -->|yes| M
D -->|no| A["Assurance Alert"]
A --> R["Remediation Plan"]
R --> N["Human annotations"]
N --> B
How it connects
Assurance Center sits downstream of execution and upstream of evidence. It reads the same OpenTelemetry spans your agents emit, including GenAI attributes like genai.agent.name, genai.tool.name, genai.cost.usd, and genai.token.usage, and turns them into longitudinal quality signals. Assurance Alerts feed Command's Triage queue and the System Posture card. Drill-downs open Lineage Explorer on the underlying Lineage Records. The verdicts recorded here (baselines held, fairness monitored, remediations closed) become part of the Sealed Evidence Bundles assembled in the Evidence Room, so "we kept watching" is itself provable.
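To make the step from span attributes to a longitudinal signal more concrete, here is a minimal sketch that totals cost per agent from span attributes. It assumes each span's attributes have already been exported as a plain dictionary and uses the attribute names listed above; the function name, the sample spans, and the idea of summing cost per agent are illustrative assumptions, not a description of how Assurance Center processes telemetry.

```python
from collections import defaultdict


def cost_per_agent(spans):
    """Sum genai.cost.usd by genai.agent.name across span attribute dicts.

    Spans missing either attribute are skipped instead of being guessed at.
    """
    totals = defaultdict(float)
    for attrs in spans:
        agent = attrs.get("genai.agent.name")
        cost = attrs.get("genai.cost.usd")
        if agent is None or cost is None:
            continue
        totals[agent] += float(cost)
    return dict(totals)


# Made-up span attributes; a real exporter would supply these.
spans = [
    {"genai.agent.name": "claims-triage", "genai.tool.name": "policy_lookup", "genai.cost.usd": 0.25},
    {"genai.agent.name": "claims-triage", "genai.cost.usd": 0.5},
    {"genai.agent.name": "support-bot", "genai.cost.usd": 0.125},
    {"genai.tool.name": "search"},  # no agent or cost, so it is ignored
]

totals = cost_per_agent(spans)
for agent, usd in sorted(totals.items()):
    print(f"{agent}: ${usd:.3f}")
```

Computing the same totals per time window and comparing them with the baseline period would give the kind of cost movement signal mentioned under drift monitoring.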
