Architecture and Retrieval

The patterns that distinguish production AI from demos.

2 May 2026 | Architecture & Patterns

Cite as: The Applied Layer. (2026). Architecture and Retrieval. The Applied Layer. https://appliedlayer-ai.com/briefings/pillar-architecture-retrieval

Pillar 2, Architecture and Retrieval

The question

What does it mean to “build with AI” once the demo works? The architectural decisions that hold up in production are not the ones a 30-second demo highlights. Retrieval beats prompting. Hybrid search beats pure vector. The integration shape, not the model name, decides whether the system survives a model swap.

Editorial thesis

Architecture is what survives the model swap. Retrieval is the central organising problem of the applied layer. Production patterns converge on hybrid retrieval (lexical plus dense plus reranker), evaluation harnesses sitting next to the LLM, and integration shapes that look more like ETL than like chatbots. The frontier models are converging on capability for the median enterprise workload; what separates a production-grade system from a demo is the architecture wrapped around the model.
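A minimal sketch of the hybrid shape described above, under stated assumptions. The anchor research is not quoted here on how lexical and dense results should be combined, so this example swaps in reciprocal rank fusion, one common choice. The names `hybrid_retrieve`, `lexical_search`, `dense_search` and `rerank` are hypothetical placeholders for whatever search backends and reranker a given system uses.

```python
from collections import defaultdict
from typing import Callable, Sequence

# Hypothetical type aliases: a retriever maps a query to ranked document ids,
# a reranker reorders candidate ids for a query.
Retriever = Callable[[str], Sequence[str]]
Reranker = Callable[[str, Sequence[str]], Sequence[str]]


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in; k = 60 is a
    conventional default, not a value taken from the source.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def hybrid_retrieve(
    query: str,
    lexical_search: Retriever,
    dense_search: Retriever,
    rerank: Reranker,
    top_k: int = 3,
    candidates: int = 10,
) -> list[str]:
    """Lexical plus dense plus reranker, with the model-specific parts injected."""
    lexical = list(lexical_search(query))[:candidates]
    dense = list(dense_search(query))[:candidates]
    fused = reciprocal_rank_fusion([lexical, dense])
    return list(rerank(query, fused))[:top_k]
```

Because the two searches and the reranker are passed in as plain callables, any of them can be replaced without touching the fusion logic, which is the sense in which the architecture is meant to survive a model swap.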

Key findings (from the anchor research)

Hybrid retrieval (lexical plus dense plus reranker) consistently outperforms pure-vector retrieval on enterprise corpora.
Most production failures are retrieval failures, not generation failures. Naive embed-and-retrieve pipelines hover near 60 percent retrieval accuracy at scale.
Chunking strategy and query rewriting improve retrieval quality far more often than changing the embedding model does. The embedding model is rarely the bottleneck once a sensible default is selected.
Agentic patterns sit on a five-tier maturity ladder. Tiers 1 and 2 (deterministic-with-LLM-glue, tool-using single agents) are broadly production-ready. Tiers 3 to 5 are workload-specific and frequently over-claimed.
The choice between architectural patterns can be made deterministically given a workload’s corpus characteristics, query complexity, latency budget, error tolerance, and verifiability surface.
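The finding that most failures are retrieval failures implies measuring retrieval separately from generation. The sketch below shows one way to do that with recall at k over a small labelled query set. The findings above do not specify which accuracy metric the research uses, so recall at k is an assumption here, and `recall_at_k` and `evaluate` are hypothetical names.

```python
from typing import Callable, Mapping, Sequence


def recall_at_k(retrieved: Sequence[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        raise ValueError("each labelled query needs at least one relevant document")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def evaluate(
    retriever: Callable[[str], Sequence[str]],
    labelled_queries: Mapping[str, set[str]],
    k: int = 5,
) -> float:
    """Mean recall at k across a labelled query set (query -> relevant ids)."""
    if not labelled_queries:
        raise ValueError("labelled_queries must not be empty")
    scores = [
        recall_at_k(retriever(query), relevant, k)
        for query, relevant in labelled_queries.items()
    ]
    return sum(scores) / len(scores)
```

Running the same labelled set before and after a change to chunking or query rewriting gives a direct comparison that does not depend on the generation step.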

What is filed under this pillar

Anchor research: “Production AI Architecture: Patterns That Distinguish Production from Demo”, the flagship survey of retrieval and agentic patterns.
Briefings on retrieval evaluation, hybrid search, and agentic patterns (forthcoming).
Reference architectures (forthcoming).

Member view

The flagship Pillar 2 research, “Production AI Architecture: Patterns That Distinguish Production from Demo”, is the canonical anchor for this pillar. Members can read the full report, including the unified decision framework that closes Part C.

Cross-references and a citation export are available on the anchor research page. Briefings filed beneath this pillar walk through specific layers (chunking, hybrid retrieval, reranking, agentic tiers) as they are published.

Patron view, methodology and primary data

The methodology note and full bibliography for the flagship architecture research live in the Patron-tier section of the anchor piece. The annotated bibliography is exportable as BibTeX. Primary research data, including the corpus-class characterisation underpinning the maturity taxonomy, ships with the Patron tier.

Patrons receive new pieces in this pillar 7 days before they go live for Members and 14 days before they go fully public.
