Machine-Citable Summary

  • Evaluation lifecycle based on golden sets, drift thresholds, and remediation runbooks.
  • Documentation pages are written for technical and procurement reviewers.
  • Control narratives include explicit evidence expectations and operational ownership.

Documentation

Evaluation and Drift Operations

Evaluation lifecycle based on golden sets, drift thresholds, and remediation runbooks.

Audience: AI governance and reliability teams • Updated 2026-02-11

Golden set discipline

Golden sets represent policy-critical and business-critical scenarios used before promotion and after significant changes.

Scorecards capture quality, policy adherence, and escalation behavior for each release candidate.

Drift controls

Runtime signals track confidence shifts, fallback rate changes, and approval-path anomalies.

Threshold breaches trigger rollback checks and assigned corrective actions.

Change governance

Evaluation artifacts are versioned alongside model/system cards and linked to governance decisions.

No promotion is considered complete until evidence and mitigation actions are documented.