The Applied Layer

Trust, Evaluation & Governance

Evaluation as practice, governance as delivery — the two disciplines that decide whether enterprise AI earns operational trust

Pillar 5 of 5 · 2 pieces filed

Enterprise AI systems fail in production in characteristic ways. The pattern common to the documented failures reviewed here — Air Canada, Avianca, iTutorGroup, State Farm, DPD, Chevrolet of Watsonville — is the absence of two operational disciplines: evaluation (the ongoing measurement of whether systems perform as intended) and governance (the delivery of policy as code, controls, and accountable workflows).

Key findings

  • Central argument. Evaluation and governance are not parallel disciplines; they are the same operational system viewed from different angles. Evaluation gates are governance components; drift monitoring is governance evidence. Governing without operational evaluation produces documents; evaluating without governance produces dashboards no one acts on.
  • Evaluation (Part A). Seven evaluation dimensions — correctness, faithfulness, relevance, safety, latency, cost, and business outcome — must all be measured. Five methods — golden datasets, LLM-as-judge, human-in-the-loop, online evaluation, and adversarial red-teaming — each address different dimensions and failure modes. Observability is the prerequisite for all of them.
  • Governance (Part B). Seven operational components constitute a working governance system. Three archetypes — Compliance-Led, Risk-Led, Engineering-Led — each succeed in some contexts and fail in others; the best operational pattern is Engineering-Led implementation, Risk-Led prioritisation, and Compliance-Led disclosure.
  • Maturity (Part C). A four-level combined maturity framework provides an operational yardstick. Most enterprise AI programmes in 2026 sit at Level 1 or low Level 2, including those in organisations that operate at Level 3–4 for traditional software delivery. The capability gap between software delivery and AI delivery is real.
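
The claim that evaluation gates are governance components can be made concrete with a minimal sketch. It assumes a release gate that refuses to pass unless all seven dimensions named above have been measured and each meets a threshold. The threshold values, the `Threshold` type, and the `evaluate_gate` function are hypothetical, chosen only for illustration; they are not taken from the research.

```python
from dataclasses import dataclass

# The seven evaluation dimensions named in the findings above.
DIMENSIONS = (
    "correctness", "faithfulness", "relevance", "safety",
    "latency", "cost", "business_outcome",
)


@dataclass(frozen=True)
class Threshold:
    limit: float
    higher_is_better: bool = True


# Hypothetical thresholds, for illustration only. In practice these
# would be set and owned by whoever is accountable for the system.
THRESHOLDS = {
    "correctness": Threshold(0.90),
    "faithfulness": Threshold(0.95),
    "relevance": Threshold(0.85),
    "safety": Threshold(0.99),
    "latency": Threshold(2.0, higher_is_better=False),  # seconds
    "cost": Threshold(0.05, higher_is_better=False),    # per request
    "business_outcome": Threshold(0.60),
}


def evaluate_gate(scores: dict[str, float]) -> dict:
    """Return a gate decision plus the evidence behind it.

    A dimension that was never measured blocks the release just as a
    failed one does, reflecting the finding that all seven dimensions
    must be measured.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    failed = []
    for dim in DIMENSIONS:
        if dim not in scores:
            continue
        t = THRESHOLDS[dim]
        ok = scores[dim] >= t.limit if t.higher_is_better else scores[dim] <= t.limit
        if not ok:
            failed.append(dim)
    return {"passed": not missing and not failed, "missing": missing, "failed": failed}
```

The returned dictionary doubles as a record of why a release was blocked or allowed, which is the sense in which an evaluation result becomes governance evidence.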
