The Applied Layer
Research

Operating Models & What Success Looks Like


2 May 2026 · 35 min read · 8,600 words · Operating Models

Cite as: The Applied Layer. (2026). Operating Models & What Success Looks Like. The Applied Layer. https://appliedlayer-ai.com/research/operating-models-and-success

Engraving plate, public domain (cover for Operating Models & What Success Looks Like). See manifest for source attribution.

Executive Summary

Two enterprises with comparable AI ambition, similar vendor stacks, and similar talent pools routinely produce materially different results. The divergence almost never traces to the technology choice. It traces to the operating model, the design authority, build capacity, governance regime, run model, and funding flow that together determine whether work crosses the gap from demonstration to production.

This report, Pillar 3 of The Applied Layer, argues four claims. First, operating model dominates technology choice as the determinant of enterprise AI outcomes. Second, the population of operating models in use clusters into four archetypes on a 2x2 grid defined by centralization (centralized vs. federated) and orientation (platform-led vs. delivery-led): Centralized Platform, Centralized Delivery, Federated Platform, and Federated Delivery / CoE. Third, six conditions of success account for most of the variance among healthy programs: production reach, evaluation in production, integration to systems of record, governance integrated with delivery, talent retained around an applied-layer practice, and stable executive sponsorship. Fourth, each archetype makes some conditions easier and others harder; the interactions are predictable enough to guide intervention.

The report grounds the framework in primary sources (JPMorganChase, Sanofi, Walmart, Morgan Stanley, Goldman Sachs, Uber, Lyft, Spotify, Netflix, LinkedIn, DBS, Bloomberg, and the Klarna reversal) and in the public retrospectives of stalled programs at McDonald’s, Air Canada, Zillow, and Klarna’s customer-service automation. The synthesis figure, the operating-model × success-condition interaction matrix, is the report’s original framework contribution.

Part A, Operating-Model Archetypes

1. Why operating model is the dominant variable

Consider two banks circa 2024-2026. JPMorganChase rolled out its proprietary LLM Suite to roughly 200,000 employees within eight months of its summer-2024 release, with about half using it daily by mid-2025; the bank attributes $1.5-2 billion in measurable annual value to its AI portfolio and has placed Chief Data & Analytics Officer Teresa Heitsenrether on the firm’s Operating Committee, reporting to CEO Jamie Dimon and President Daniel Pinto.123 Most large banks in the same period announced AI strategies of comparable ambition; few moved comparable volumes of work into production over a comparable window. The gap is not explained by foundation-model access (every major bank can buy GPT-4-class capacity), nor by talent (the top firms compete in the same pool), nor by cloud spend.

What differs is the operating model. JPMC made three structural choices that the public record makes legible: it elevated a single accountable executive to the operating committee; it built an abstraction layer (LLM Suite) that lets the firm swap models behind a stable employee-facing surface and bind that surface to internal data; and it pushed a deliberate “democratization” rollout rather than a curated pilot.23 Each is an operating-model decision. Each precedes any vendor selection.

The contrast case is McDonald’s automated order-taking pilot with IBM, terminated in June 2024 after roughly three years of testing at more than 100 restaurants.45 The technology was real; competitors continued to deploy adjacent systems (Wendy’s, Dunkin’, White Castle).4 What McDonald’s lacked, the public record suggests, was an operating model that owned end-to-end production responsibility for a customer-facing AI surface: error rates that surfaced in viral videos (bacon ice cream, hundreds of McNuggets) overwhelmed any internal escalation path before remediation could catch up.65

This pillar’s central claim follows: two organizations with comparable ambition, talent, and vendor access produce divergent AI outcomes because their operating models differ. A successful enterprise AI program is recognizable not by its model choice but by the consistency of six conditions described in Part B. The remainder of Part A names the four archetypes inside which those conditions are easier or harder to meet.

2. What “operating model” means here

We use the term in the sense Ross, Weill, and Robertson popularized in Enterprise Architecture as Strategy (Harvard Business Review Press, 2006): the operating model as the necessary level of business-process integration and standardization for delivering goods and services, rather than the looser consultancy sense of a “target operating model.”7 An AI operating model has five operational components:

  1. Design authority. Who decides what gets built, against which standards, and who can say no.
  2. Build capacity. Where the engineers, scientists, and product staff sit; how they are funded; how their work is prioritized.
  3. Governance regime. How risk, compliance, model-risk-management, and policy review attach to delivery.
  4. Run model. Who operates production AI: SLAs, on-call, eval cadence, change control, integration with systems of record.
  5. Funding flow. Whether AI investment runs through a central P&L, business-unit P&Ls, or a hybrid; whether platforms are funded as overhead or chargeback.

This is distinct from the org chart. Two firms with identical reporting lines can run different operating models if their funding flows or governance regimes differ. TOGAF’s Architecture Development Method is one established practice for thinking about target operating models in enterprise architecture; we reference it as practice context, not as load-bearing evidence.8

What an operating model is not: a strategy deck, a maturity curve, or a list of platforms procured. The test is operational. Walk into any team building an AI capability and ask: who decided this should exist, who pays for it, who owns its incidents, who reviews it before launch, and who decides when to retire it? Five answers; one operating model.

3. The 2x2 framework

Two axes capture most of the variation observed across published enterprise AI programs:

Axis 1, Centralization. Centralized models concentrate AI build capacity, design authority, and platform investment in one organization reporting to a single executive. Federated models distribute build capacity into business units, divisions, or product teams, with a smaller central group setting standards, providing shared services, or both.

Axis 2, Orientation. Platform-led models invest first in shared infrastructure (feature stores, model registries, eval frameworks, governance pipelines) and treat delivery teams as users of the platform. Delivery-led models invest first in shipping use cases against business problems and treat platform work as consequent to delivery demand.

The four cells are operationally distinct.

Figure 1, The Operating-Model 2x2

Centralized capacity × Platform-led: Centralized Platform. One central team builds shared platforms; product teams consume them. Examples: Uber Michelangelo; Netflix Metaflow + Maestro; Walmart Element; LinkedIn agent platform; Spotify Hendrix.

Centralized capacity × Delivery-led: Centralized Delivery. One central team owns both platforms and the delivery of named flagship products. Examples: Bloomberg AI Group / BloombergGPT; Morgan Stanley Firmwide AI (AI @ Morgan Stanley Assistant, Debrief, AskResearchGPT); JPMorganChase LLM Suite under the firmwide CDAO.

Federated capacity × Platform-led: Federated Platform. Distributed delivery teams in business units; a central team provides standards, shared platforms, and governance rails. Examples: DBS “data chapter” (≈700 data professionals embedded in units, central skilling and governance); Capital One’s hybrid data governance; Sanofi Digital Accelerators network.

Federated capacity × Delivery-led: Federated Delivery / CoE. Distributed delivery teams; a central Center of Excellence provides advisory, training, and pattern reuse rather than platforms. Examples: Goldman Sachs GS AI Platform with line-of-business delivery; Sanofi AI Research Factory + line-of-business delivery; many large pharma and insurance CoEs documented in trade press.

The archetypes are not maturity levels. Each is a coherent answer to a particular set of constraints. Federated Platform suits firms whose business units have material domain difference; Centralized Platform suits firms whose use cases share infrastructure needs; Centralized Delivery suits firms whose competitive advantage rests on a small number of high-stakes flagship surfaces; Federated Delivery / CoE suits firms whose AI work needs proximity to specialist domain teams that resist central platform constraints.

The remaining sections in Part A describe each archetype operationally, with named examples, what each does well, and where each tends to struggle.

4. Centralized Platform

In a Centralized Platform model, AI capacity is concentrated in one organization that builds and operates shared infrastructure: feature stores, training compute, model registries, deployment systems, online evaluation tooling, governance pipelines. Product teams across the firm consume the platform and own the use cases built on top of it.

The clearest published exemplar is Uber’s Michelangelo, introduced in 2015-2017 and continuously evolved through three explicit phases: predictive ML on tabular data (2016-2019), deep learning (2019-2023), and generative AI (2023-).910 By 2024 the platform managed roughly 400 active ML projects, 20,000 monthly training jobs, more than 5,000 production models, and 10 million real-time predictions per second at peak.9 Netflix’s Metaflow, originally built on Titus and now hybrid with AWS, supports more than 3,000 AI/ML projects and tens of petabytes of artifacts at Netflix alone.1112 Spotify’s Hendrix platform supports more than 600 internal ML practitioners as of the QCon 2023 disclosures.13 LinkedIn’s agent platform formalized AI agents as gRPC microservices and built a unified GenAI application stack underneath the Hiring Assistant.1415 Walmart’s Element platform centralizes ML capabilities to “experiment with AI models without having to worry if they would fit into a specific vendor or cloud provider,” in CTO Hari Vasudev’s account, and now extends into agent infrastructure (WIBEY).1617

What this archetype does well. Production reach. Centralized Platform organizations consistently get more work into production because the platform amortizes the integration cost: identity, data pipelines, deployment, and observability are solved once and reused. Governance integration is also easier: policies bake into platform stages (Spotify built Hendrix with traceability “at its core”; LinkedIn’s eval-driven approach pairs AI judges with human-validated data).1418 Reproducibility, version control, and rollback discipline emerge naturally from platform investment.

Where it struggles. Two recurring patterns. First, the platform team can become an order-taker rather than a partner, the “platform tax” complaint, where business teams perceive the central team as slow, opinionated, or insufficiently fluent in their domain. Lyft’s 2025 rearchitecture of LyftLearn, moving offline training to AWS SageMaker while retaining custom Kubernetes for serving, is a candid acknowledgement that custom platform investment had become an operational-complexity tax that no longer paid for itself.19 Second, talent retention can be brittle: the strongest platform engineers are highly portable, and a platform team that loses its tech lead often loses momentum for quarters. Pinterest’s published 2014-2025 ML-platform retrospective describes five eras of repeated platform rebuilds forced by underlying modeling shifts.20

The pattern suggests Centralized Platform rewards firms whose use cases are infrastructure-similar enough to share platform investment and whose central team has the standing to be a partner rather than a gate.

5. Centralized Delivery

In Centralized Delivery, capacity is concentrated in one organization but the focus is delivery of named, flagship products rather than platforms. The central team builds, ships, and operates the AI surfaces; platforms are emergent and serve those products.

Bloomberg’s AI Group, which built and published BloombergGPT (a 50-billion-parameter model trained on a 363-billion-token financial corpus augmented by 345 billion general-purpose tokens), is a Centralized Delivery exemplar in the publishing-and-data segment.2122 Morgan Stanley’s Firmwide AI organization, led by Jeff McMillan, owns the AI @ Morgan Stanley Assistant (rolled out across wealth management financial advisors in September 2023, reaching documented adoption above 98% of advisor teams), AI @ Morgan Stanley Debrief (launched June 2024), and AskResearchGPT (rolled out to institutional securities staff from summer 2024).232425 JPMorganChase’s LLM Suite, built in-house under CDAO Heitsenrether and Chief Analytics Officer Derek Waldron, scaled from zero to roughly 200,000 onboarded users in eight months and updates on an eight-week cadence; it is a hybrid: platform-shaped in form, but the operating reality is centralized delivery of one flagship surface that the rest of the bank consumes.226

What it does well. Speed to flagship. When a single accountable team owns the use case end-to-end, decision latency collapses. Morgan Stanley’s documented eval framework, evolving from summarization evals to translation evals to retrieval-tuning collaboration with OpenAI, is the kind of iterative discipline that is harder to maintain across distributed teams.23 Centralized Delivery is also strong on stable executive sponsorship: a flagship product creates a clear accountability chain to the C-suite.

Where it struggles. Coverage and reuse. Centralized Delivery teams ship excellent flagship surfaces but often fail to seed reusable platform capability for the next wave of use cases. Bloomberg’s own published candor about BloombergGPT, including training instabilities at steps 115,500, 129,900, and 137,100 documented in the paper’s Training Chronicles appendix, and the fact that GPT-4 subsequently matched BloombergGPT’s domain performance without finance-specific pretraining, captures the strategic risk: a flagship investment can be obsolesced by a moving frontier.22 Centralized Delivery also concentrates key-person risk and can starve business units that fall outside the flagship scope.

6. Federated Platform

In Federated Platform, build capacity is distributed across business units, divisions, or domains. A central group provides shared platforms, standards, and governance rails, but does not deliver the use cases.

DBS Bank in Singapore is the clearest documented exemplar. Its “data chapter” model brings together approximately 700 data professionals, including roughly 200-250 data scientists per the Group CIO and CDO’s published accounts, who sit inside business units for day-to-day work but belong to a central organization responsible for skilling, governance, and reusable model libraries.272829 DBS reports a library of approximately 1,500 models, time-to-deploy reduced from 18 months to 2-3 months, more than 370 use cases, and around S$750 million in 2024 economic value attributed to data and AI.2829 Capital One operates a similar federated data-governance practice (central platforms and policy, federated execution embedded in lines of business), with what CDO Salim Syed’s team calls “sloped governance” calibrated to data sensitivity.30 Sanofi’s network of Digital Accelerators, plus the AI Research Factory and the AI-powered Modulus manufacturing facilities expected to be operational in 2026, fits the same shape: shared platforms (Solvify for solubility prediction, SimplY for yield analytics, the plai decision app built with Aily Labs) anchored by a Chief Digital function with embedded delivery in R&D, manufacturing & supply, and commercial operations.313233

What it does well. Stakeholder alignment and integration to systems of record. Because delivery teams sit inside business units, they understand the systems where work actually happens. DBS’s 100+ AI/ML algorithms generating roughly 30 million hyper-personalized “nudges” per month in Singapore work because the modelers sit close to the customer-facing systems.29 Federated Platform also distributes talent risk: losing one BU’s lead does not stop another’s program.

Where it struggles. Standards drift and platform tax disputes. Federated Platform requires the central team to enforce standards it does not own delivery of, which creates friction when business-unit timelines diverge from platform releases. Governance can be uneven across units (Capital One’s published “sloped governance” is a deliberate response to exactly this problem).30 When the central platform team underinvests, business units fork; when it overreaches, they route around it. The model only works when central authority is real and central service is good.

7. Federated Delivery / CoE

In Federated Delivery, build capacity sits in business units and a Center of Excellence provides advisory services, training, pattern reuse, and limited shared services, but not a single shared production platform. Use cases ship from BU teams using whatever tooling fits.

Goldman Sachs’ GS AI Platform under CIO Marco Argenti is a documented hybrid: a firmwide platform that hosts multiple foundation models (OpenAI, Anthropic, Google, Meta) behind a unified interface, with delivery owned in lines of business. Goldman has rolled out the GS AI Assistant to its bankers and traders for specific high-value workflows; CEO David Solomon has stated public adoption goals.3435 Sanofi’s AI Research Factory and discovery teams operate on a federated delivery pattern in R&D, distinct from the platform-shaped manufacturing accelerators.31 Many large pharmaceutical, insurance, and industrial firms run a CoE pattern documented in trade and engineering press, often combined with a chief AI officer or chief data and analytics officer in a coordinator role.36

What it does well. Talent retention and stakeholder alignment. Practitioners sit close to domain experts and to a CoE community of practice they want to belong to. The CoE provides career growth, peer review, and conference visibility without the constraints of a central platform team. The model also matches the structure of firms whose business units have meaningful regulatory or domain difference (Sanofi’s R&D vs. Manufacturing & Supply; a large bank’s retail vs. asset management vs. markets).32

Where it struggles. Governance integration and production reach. Without a shared platform, governance is harder to embed at the pipeline level; it tends to be appended at the end, where it kills programs late. The McKinsey/MIT Sloan account of DBS makes the point indirectly by showing how much DBS invested in shared platforms despite federated delivery; pure CoE patterns without shared rails accumulate technical debt and inconsistent standards.28 Sanofi has explicitly invested in an ethical framework (RAISE, Responsible AI at Sanofi for Everyone) and centralized infrastructure on Google Cloud and AWS to address this gap.3237

A closing note on hybrids. Real organizations rarely sit in pure quadrants. JPMC’s Operating Committee elevation of the CDAO is centralized, but lines of business retain CDAOs who report into both Heitsenrether and their BU leadership, a hybrid with strong central design authority.1 DBS combines federated delivery with strong central platforms.28 The framework’s value is not categorical labeling; it is the prediction it generates about which conditions of success will be easier or harder, taken up in Part C.


Part B, Conditions of Success

8. What healthy AI programs share

Across the published record of named enterprise AI programs that have shipped (JPMC, Morgan Stanley, DBS, Walmart, Sanofi, Uber, LinkedIn, Spotify, Lyft, Netflix, Bloomberg), six conditions recur with notable consistency. They appear together; healthy programs satisfy most or all of them simultaneously. Stalled programs, including the high-profile reversals (Klarna’s customer-service automation, McDonald’s IBM drive-through, Zillow Offers’ iBuying algorithm, Air Canada’s chatbot), miss two or more.

The conditions are not a maturity curve and not a checklist of capabilities to procure. They are operational properties of a program, the kind of thing visible to anyone who walks the floor for a week and asks the right questions. The case for naming them is practical: condition names become vocabulary that program leaders and steering committees can use to localize problems quickly.

The six are: production reach, evaluation in production, integration to systems of record, governance integrated with delivery, talent retained around an applied-layer practice, and stable executive sponsorship.

Figure 3, The Six Conditions of Success

Each condition is listed with what it looks like operationally, diagnostic signs of presence, and diagnostic signs of absence.

Production reach. Operationally: shipped product on production runtime, with operational ownership and a regular cadence of new launches. Present: documented production model count rising; named on-call rotations; release cadence visible in changelogs. Absent: endless POCs; demo decks substituting for working software; “in pilot” for more than 12 months.

Evaluation in production. Operationally: continuous online evaluation, golden datasets, drift monitoring, daily regression tests. Present: eval framework with named metrics per use case; CSAT/quality tracking by complexity tier; drift alerts wired to on-call. Absent: “worked in dev”; surprise failures in production; quality tracked only by aggregate volume metrics.

Integration to systems of record. Operationally: AI bound to identity, data pipelines, and change-of-record gates in the systems where work happens. Present: models call and write to systems of record; AI outputs flow into CRM, ERP, ticketing, EHR. Absent: a standalone chat surface that copies/pastes into the real systems; an “AI sandbox” with no production hooks.

Governance integrated with delivery. Operationally: risk, model-risk-management, legal, privacy, and policy review embedded in the delivery pipeline. Present: governance gates in CI/CD; documented decisions per release; a clear path from governance to engineering. Absent: late-stage governance objections that kill or freeze launches; governance as a separate org with review queues.

Talent retained around an applied-layer practice. Operationally: strong applied practitioners attached to a coherent practice they want to stay in. Present: tenure of senior applied staff above 24 months; internal mobility; conference talks under the firm’s name. Absent: hiring-funnel gaps; contractor over-dependency; key-person risk; “we lost X and the program stalled.”

Stable executive sponsorship. Operationally: sustained air cover from a sponsor who understands the work and can defend it through sponsor-rotation events. Present: sponsor on the operating committee or equivalent; multi-year ROI baselines; sponsor-led portfolio reviews. Absent: shifting goals; sponsor turnover every 12-18 months; ROI debates without baselines.

9. Production reach

Production reach is the discipline that gets demos out of the demo phase. The MIT NANDA initiative’s “GenAI Divide: State of AI in Business 2025” report, based on 150 leader interviews, a 350-employee survey, and analysis of 300 public AI deployments, found that approximately 95% of enterprise generative AI pilots delivered no measurable P&L impact, with “only 5% of integrated AI pilots extract[ing] measurable profit and loss impact.”3839 The figure is survey-based and the methodology is summarized rather than fully reproduced; we report it with that caveat. The pattern it captures, the POC-to-production gap, is corroborated across many engineering retrospectives.

Production reach in the affirmative is visible. Uber’s Michelangelo runs more than 5,000 models in production at a peak of 10 million predictions per second.9 Lyft’s LyftLearn supports “thousands of production models making hundreds of millions of real-time predictions per day” in its 2025 architecture disclosure.19 Morgan Stanley’s AI @ Morgan Stanley Assistant achieved 98% adoption across advisor teams in wealth management.23 DBS reports 1,500 models in production and time-to-deploy compressed from 18 months to 2-3 months.28 These are documented figures from primary sources, not survey aggregates. JPMorganChase’s LLM Suite reaching 200,000 employees in eight months is the most extreme recent case.26

The inverse pattern is also visible. McDonald’s three-year automated order-taking pilot at over 100 locations was terminated in June 2024 without a production graduation; CFO Mason Smoot’s franchisee memo cited a desire to “explore voice ordering solutions more broadly.”45 The technology was real (IBM continued the platform with Wendy’s, Hardee’s, Dunkin’), but McDonald’s had no operating-model path that took the work from pilot to system-wide production. Klarna’s mid-2025 reversal, CEO Sebastian Siemiatkowski telling Bloomberg in May 2025 that the AI-only customer service had produced “lower quality” and that the company was rehiring human agents, reflects a specific production-reach gap: the absence of a graceful escalation path that prevents quality regressions on the hard edges of the distribution.4041

Diagnostic signs of presence. Named on-call rotations for AI services; production model registry with non-trivial cardinality; release cadence visible in engineering blog or changelog; named system-of-record integrations.

Intervention patterns when absent. Move the program from a project structure to a product structure. Assign explicit operational ownership. Set a launch cadence and protect it. Stop asking “is the demo good?” and start asking “what shipped to whom this month?”

10. Evaluation in production

Evaluation is not a one-time gate at launch; it is a continuous practice. The defining property of programs that hold up after deployment is that they have invested in eval infrastructure with the same seriousness as model infrastructure. The forward-reference here is to Pillar 5, Governance & Evaluation, which treats this in operational depth.

Morgan Stanley’s account, published with OpenAI in 2024, is unusually candid: the team built summarization evals, then translation evals for multilingual clients, then retrieval evals as the document corpus expanded from 7,000 questions to 100,000 documents.23 Daily regression suites, expert grading, and prompt-and-retrieval iteration with the model provider are all named. Lyft’s LyftLearn implements “model self-tests”: sample inputs and expected outputs defined in model code, run automatically on every load and on a recurring schedule.42 Spotify’s Hendrix is “designed around traceability” specifically so production ML can be audited.18 Booking.com’s published 2019 paper on 150 ML models, based on randomized controlled trials, found that offline performance gain and business value gain were essentially uncorrelated (Pearson correlation −0.1 with a 90% CI of (−0.45, 0.27) across 23 model comparisons), making continuous online eval a requirement rather than a luxury.43
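The self-test pattern Lyft describes can be sketched in a few lines. This is an illustrative assumption of the shape of such a check, not LyftLearn’s actual API; the class, field names, and thresholds are hypothetical.

```python
# Minimal sketch of a "model self-test": expected input/output pairs travel
# with the model code and run on every load. All names here are hypothetical
# illustrations, not LyftLearn's actual API.

class ChurnModel:
    def predict(self, features: dict) -> float:
        # Toy stand-in for a real trained model artifact.
        return 0.9 if features["days_since_last_ride"] > 60 else 0.1

    # Self-test cases defined alongside the model: (input, expected score range).
    SELF_TESTS = [
        ({"days_since_last_ride": 90}, (0.5, 1.0)),  # lapsed rider: high risk
        ({"days_since_last_ride": 3},  (0.0, 0.5)),  # active rider: low risk
    ]

def run_self_tests(model) -> None:
    """Run on every model load; refuse to serve an artifact that fails."""
    for features, (lo, hi) in model.SELF_TESTS:
        score = model.predict(features)
        if not (lo <= score <= hi):
            raise RuntimeError(
                f"self-test failed: predict({features}) = {score}, "
                f"expected [{lo}, {hi}]"
            )

model = ChurnModel()
run_self_tests(model)  # raises if the loaded artifact has regressed
print("self-tests passed")
```

The point of the pattern is that the serving layer, not a human, decides whether a freshly loaded artifact is fit to serve, which catches silently corrupted or stale models before they take traffic.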

The inverse, works in dev, surprise failures in production, is captured by Zillow Offers. The November 2021 shutdown of the iBuying program, with $304 million in Q3 inventory write-downs, more than $500 million in total losses, and a 25% workforce reduction, traced in part to model performance degradation as housing-market volatility increased post-pandemic.444546 Published analyses (including the 2024 Journal of Information Systems Education case study) note that Zillow’s “Project Ketchup” prevented pricing experts from modifying algorithm valuations and asked them to stop questioning the algorithm’s outputs, a governance-and-evaluation failure layered on the algorithmic one.47 The McDonald’s drive-through pilot’s published failure mode was similar: the system performed acceptably on average but failed on the long tail of orders, and the eval framework didn’t catch it before the failures hit social media at scale.6

Diagnostic signs of presence. Named eval owner per use case; CSAT or accuracy tracked by query/complexity tier (not just aggregate); drift alerts wired to on-call; documented golden datasets; regular regression-suite runs.

Intervention patterns when absent. Build an eval framework before scaling. Track metrics by complexity tier, not aggregate. Wire drift to on-call. Treat eval engineers as first-class roles, not as a contractor function.
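The case for tracking by complexity tier rather than in aggregate is easy to show with a toy example, assuming nothing beyond made-up eval records (the field names and tiers are illustrative):

```python
# Toy eval log where easy queries dominate volume: aggregate accuracy looks
# healthy while the hard tier is failing badly. Field names are illustrative.
from collections import defaultdict

results = (
    [{"tier": "easy", "correct": True}] * 95
    + [{"tier": "hard", "correct": True}] * 1
    + [{"tier": "hard", "correct": False}] * 4
)

aggregate = sum(r["correct"] for r in results) / len(results)

by_tier = defaultdict(lambda: [0, 0])  # tier -> [correct, total]
for r in results:
    by_tier[r["tier"]][0] += r["correct"]
    by_tier[r["tier"]][1] += 1

print(f"aggregate accuracy: {aggregate:.0%}")  # 96%: looks fine
for tier, (correct, total) in by_tier.items():
    print(f"{tier}: {correct}/{total} = {correct / total:.0%}")
```

Here the aggregate reads 96% while the hard tier sits at 20%, which is exactly the long-tail failure mode the McDonald’s and Klarna accounts describe.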

11. Integration to systems of record

AI capability that lives only in a standalone surface (a chat box, a separate web app) rarely changes outcomes. Healthy programs bind AI into the systems where work actually happens: identity, ERP, CRM, EHR, ticketing, document management.

Morgan Stanley’s AI @ Morgan Stanley Debrief integrates directly into Salesforce, with notes saved automatically post-meeting and follow-up emails drafted in advisor inboxes: the AI surface and the system of record are the same surface.25 DBS’s nudge architecture writes into the customer-facing app and to relationship-manager workflows.29 LinkedIn’s Hiring Assistant sits inside recruiter workflows via a client-side SDK, with state-tracking memory and grounded data from the Talent Graph.15 Walmart’s Element platform, with stateful agent-aware pipelines and tool-calling hooks, is explicitly engineered for systems-of-record integration.1617

The inverse pattern, AI in a standalone surface that never reaches the workflow, is the dominant stall pattern in the MIT NANDA dataset. The report documents what one CEO called the “shadow AI economy,” where employees use consumer ChatGPT despite firm purchases of dedicated AI tools because the dedicated tools are not bound to the systems where work happens.38 Multiple 2025 surveys (BlackFog, Lenovo Work Reborn, IBM-Censuswide) consistently show 49-78% of employees using unsanctioned AI tools, a strong indirect indicator that sanctioned tools are not adequately integrated.484950 These are survey numbers whose methodologies vary, and we report them with that caveat; what is robust is the directional finding that integration gaps drive shadow usage.

The Air Canada chatbot case illustrates the same point with legal teeth. In Moffatt v. Air Canada (2024 BCCRT 149), the British Columbia Civil Resolution Tribunal held the airline liable for negligent misrepresentation when its chatbot promised retroactive bereavement-fare refunds that conflicted with policy on a separate webpage.5152 The bot was not integrated to the same authoritative source as the rest of the website; the tribunal found that customers should not be expected to “double-check information found in one part of its website on another part.”51 An integration-to-systems-of-record failure became a legal liability.

Diagnostic signs of presence. AI outputs flow into CRM, ERP, EHR, ticketing without copy-paste; identity is shared with the system of record; change-of-record gates exist where required.

Intervention patterns when absent. Stop building standalone surfaces. Make the next AI capability extend an existing system. Demand SSO and shared identity from day one.

12. Governance integrated with delivery

Governance integrated with delivery means risk, MRM, legal, privacy, and compliance review embedded in the delivery pipeline rather than appended at the end. This forward-references Pillar 5, Governance & Evaluation for operational depth.

The EU AI Act’s high-risk obligations under Articles 9-17 (provider) and Article 26 (deployer) make this a regulatory reality, not a stylistic choice; enforcement is currently scheduled to bind on August 2, 2026, though a November 2025 Digital Omnibus proposal would delay Annex III compliance to December 2027.5354 Organizations operating high-risk AI systems in regulated EU sectors face documented compliance investments; one Cloud Security Alliance research note estimates initial implementation in the $8-15 million range for large enterprises, with $1-5 million in annual ongoing cost, an order-of-magnitude figure rather than a precise benchmark.54

DBS’s PURE framework (Purposeful, Unsurprising, Respectful, Explainable), embedded in every model deployment, is governance integrated with delivery rather than appended.55 Sanofi’s counterpart, RAISE (Responsible AI at Sanofi for Everyone), is positioned as a strategic asset, not a compliance overlay.32 Spotify’s Hendrix was deliberately designed with traceability “at its core” so production audits would be tractable.18 Capital One’s “sloped governance”, calibrated to data sensitivity, is a federated answer to the same problem.30

The inverse, late-stage governance objections that kill the program, is the most common stall pattern documented in trade press. Morgan Stanley’s published account is illuminating in the affirmative: the firm’s eval framework explicitly integrates compliance regression suites, and OpenAI’s zero-data-retention policy was a precondition for the partnership, not an afterthought.23 The Klarna reversal is a partial case in this category: not a governance veto, but a quality-and-brand judgment that arrived after, not during, the AI-replaces-humans rollout.41

Diagnostic signs of presence. Governance gates in CI/CD; documented sign-offs per release; clear DRIs from legal/MRM/privacy attached to the delivery team, not separate.

Intervention patterns when absent. Move governance staff into the delivery org as embedded members. Build governance gates as automated pipeline steps. Track governance review time as a release metric.
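The “governance gates as automated pipeline steps” intervention can be sketched as a release-manifest check that fails the pipeline when an embedded reviewer has not signed off. A minimal sketch; the manifest layout, role names, and function names below are our own illustrative assumptions, not any cited firm’s schema:

```python
# Illustrative governance gate for a release pipeline: block the release
# unless every required reviewer role has a recorded, approved sign-off.
# REQUIRED_SIGNOFFS and the manifest shape are hypothetical.

REQUIRED_SIGNOFFS = {"legal", "mrm", "privacy"}  # embedded DRIs per release

def governance_gate(manifest: dict) -> tuple[bool, list[str]]:
    """Return (passed, missing_roles) for a release manifest."""
    signed = {s["role"] for s in manifest.get("signoffs", []) if s.get("approved")}
    missing = sorted(REQUIRED_SIGNOFFS - signed)
    return (not missing, missing)

manifest = {
    "release": "2026.05.1",
    "signoffs": [
        {"role": "legal", "approved": True, "dri": "named-person"},
        {"role": "mrm", "approved": True, "dri": "named-person"},
        # privacy sign-off absent: the gate fails this pipeline step
    ],
}

passed, missing = governance_gate(manifest)
print(passed, missing)  # False ['privacy']
```

Run as a CI step that exits nonzero when `passed` is false, this also yields the release metric named above for free: governance review time is the gap between release-branch cut and the last sign-off timestamp.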

13. Talent retained around an applied-layer practice

The applied AI talent market, by multiple market-rate sources, runs at roughly $200,000-$355,000 average compensation for senior individual contributors in U.S. markets in 2025-2026, with documented year-over-year growth of 38% and demand-supply ratios in the 3:1 range.5657 These are private compensation-data sources with varying methodologies; what is reliable is the direction: the market is hot, retention matters, and contractor over-dependency is fragile.

Programs that retain talent share a pattern. They build a practice, a community of named senior practitioners who publish, speak, and mentor. Uber’s Michelangelo team published the seminal 2017 platform paper and a 2024 evolution piece authored by Kai Wang, group product manager.958 Lyft’s engineering blog publishes architecture decisions and post-mortems under named authors (Yaroslav Yatsiuk, Vinay Kakade, Konstantin Gizdarski).195942 Spotify’s Hendrix team, Mike Seid (tech lead), Divita Vohra (senior PM), David Xia (senior engineer), is a named, public-facing identity.1360 Netflix’s Metaflow team, Ville Tuulos and successors, became the basis of a venture-backed company, Outerbounds, while continuing to support the Netflix internal platform.1112 DBS’s “data chapter” is explicitly designed to provide belonging and skilling, with 18,000+ employees trained in data skills and 2,000 designated as advanced practitioners.61

The inverse, hiring funnel gaps, contractor over-dependency, key-person risk, is harder to document directly because firms do not publish their own attrition rates by function. The structural risk shows up indirectly: programs that disappear from the engineering blog for 18 months tend to be programs whose senior IC layer turned over. Stanford HAI’s AI Index 2025 reports U.S. private AI investment at $109.1 billion in 2024, with the AI talent market continuing to favor candidates over employers; retention is a survival skill, not an HR optimization.62

Diagnostic signs of presence. Tenure of senior applied staff >24 months; named external publishing; conference talks under the firm’s name; internal mobility into and out of the AI org.

Intervention patterns when absent. Build a practice, not a roster. Invest in publication, conference talks, and visible technical leadership. Pay market. Convert key contractors to FTE.

14. Stable executive sponsorship

The IBM 2025 enterprise study (covering 2,300 organizations) found that 26% had a Chief AI Officer or equivalent, up from 11% two years earlier, and that organizations with a named senior AI executive reported approximately 10% higher self-reported ROI on AI investments.63 As a survey-based finding, this carries the usual methodology caveat; the directional pattern is consistent across multiple 2025 surveys.63

What stable sponsorship looks like operationally is more specific than a title. JPMC’s CDAO sits on the Operating Committee, reports to Dimon and Pinto, and has been in the role since the structure was created in 2023.13 DBS’s CEO Piyush Gupta personally championed AI from 2014 forward, using the GANDALF mnemonic and the “1,000 experiments” KPI; current CEO Tan Su Shan continued the agenda after taking over in April 2025.28 Goldman Sachs’ AI strategy is owned by CIO Marco Argenti, with explicit CEO sponsorship from David Solomon and a published AI Exchanges podcast cadence that signals ongoing executive engagement.3435 These are multi-year sponsorship runs, not headline appointments.

The inverse, shifting goals, sponsor turnover, ROI debates without baselines, is documented less in primary sources than in trade press. The Klarna case is partly instructive here: Siemiatkowski’s December 2024 statements (“AI can already do all of the jobs that we, as humans, do”) and his May 2025 statements to Bloomberg (“investing in the quality of human support is the way of the future for us”) differ markedly across five months, sponsor narrative volatility, executed in public.4164 At Zillow, CEO Rich Barton’s November 2021 acknowledgement of algorithmic-pricing failure as a primary cause of the iBuying shutdown was sponsorship of a different kind, accountability rather than continuity.45

Diagnostic signs of presence. Named senior sponsor with multi-year tenure; sponsor on operating committee; multi-year ROI baselines; sponsor visibility at portfolio reviews.

Intervention patterns when absent. Build the ROI baseline early, before debates start. Track sponsor turnover as program risk. Develop a deputy who can absorb sponsor rotation.


Part C, Synthesis

15. How operating model and success conditions interact

Each archetype tends to make some conditions easier and others harder. The interactions are predictable enough to guide intervention. Figure 4, the synthesis figure of this report, captures the pattern.

Figure 4, Operating-Model × Success-Condition Interaction Matrix

Centralized Platform (Uber, Netflix, Walmart, LinkedIn, Spotify)
  • Production reach: easier, platform amortizes integration cost
  • Evaluation in production: easier, eval baked into platform stages
  • Integration to SoR: typical, depends on platform integration choices
  • Governance integrated: easier, governance gates in pipeline
  • Talent retention: harder, platform engineers highly portable
  • Executive sponsorship: typical, depends on platform sponsor durability

Centralized Delivery (Bloomberg, Morgan Stanley, JPMC LLM Suite)
  • Production reach: easier, single accountable team ships flagship
  • Evaluation in production: easier, eval discipline travels with product team
  • Integration to SoR: easier, flagship designed against named SoR
  • Governance integrated: typical, depends on whether governance is staffed inside the team
  • Talent retention: typical, strong if a practice is built; risky if all centered on one product
  • Executive sponsorship: easier, flagship product creates a clear C-suite line

Federated Platform (DBS, Capital One, Sanofi M&S)
  • Production reach: typical, depends on platform-team velocity
  • Evaluation in production: harder, eval standards drift across BUs
  • Integration to SoR: easier, BU teams sit close to SoR
  • Governance integrated: harder, central governance enforces what it doesn’t deliver
  • Talent retention: easier, proximity to domain plus central skilling
  • Executive sponsorship: typical, depends on central function authority

Federated Delivery / CoE (Goldman GS AI, Sanofi Research, many pharma/insurance)
  • Production reach: harder, coordination cost; pattern reuse weak
  • Evaluation in production: harder, eval inconsistent across BUs
  • Integration to SoR: easier, BU teams own SoR
  • Governance integrated: harder, governance often appended late
  • Talent retention: easier, CoE community + domain proximity
  • Executive sponsorship: harder, distributed accountability dilutes sponsor signal

Three predictions follow from the matrix.

Centralized Platform tends to achieve production reach and governance integration but struggles with talent retention. The pattern is observable across Uber, Netflix, Spotify, and LinkedIn: strong production scale and governance discipline, paired with recurring senior-IC turnover that has prompted multiple platform rebuilds. The intervention: invest disproportionately in the practice (publication, conference visibility, named internal leadership) to slow attrition.

Federated Delivery / CoE tends to achieve talent retention and stakeholder alignment but struggles with governance integration and production reach. Observable across many pharma and insurance CoEs, and partially in Goldman’s federated delivery model. The intervention: stand up shared evaluation and governance rails, converting the model toward Federated Platform, without absorbing delivery into the center.

Centralized Delivery is the fastest archetype to a flagship surface but the slowest to broaden coverage. Bloomberg’s BloombergGPT and Morgan Stanley’s wealth-management suite are the cleanest examples; the strategic risk is obsolescence by frontier movement (BloombergGPT was matched by GPT-4 within months of release).22 The intervention: deliberately seed platform capability inside the central team as the second-wave investment.

Federated Platform is the most balanced but requires the highest central authority. DBS’s documented 1,500-model library and 750M-Singapore-dollar 2024 economic value sit on a decade of central data-platform investment that the firm’s leadership protected through CEO transitions.2829 Without the central authority to enforce standards, Federated Platform decays toward Federated Delivery.

16. Diagnostic checklist

The following checklist gives an AI program leader 12 questions to ask at month 3, month 6, and month 12 of a program (or an outside reviewer 12 questions to ask of a program at any point in its life). Each question has a healthy and a stalling answer pattern.

Figure 5, Diagnostic Checklist by Program Month

M3. Who is the named DRI for this AI capability in production? Healthy: one named person; on-call rotation defined. Stalling: “the team”; rotation undefined.
M3. Where does the eval framework live, and who owns it? Healthy: named eval owner; framework in repo with daily runs. Stalling: “we measure CSAT”; no per-tier metrics.
M3. Which system of record will this AI write to? Healthy: named SoR with integration plan and identity story. Stalling: “we’re starting with a chat surface.”
M3. Who from governance/MRM/legal is embedded in the delivery team? Healthy: named individual attending delivery standups. Stalling: “we’ll loop them in for review.”
M6. What shipped this month? Healthy: concrete release notes; user-visible change. Stalling: “demos, working through next pilot.”
M6. What’s the production model count and how is it changing? Healthy: numbered, growing; retirement discipline visible. Stalling: unknown or static.
M6. What’s the tenure of the senior applied IC group? Healthy: >18 months on average; internal hires visible. Stalling: high contractor ratio; recent senior departures.
M6. What ROI baseline was set at month 0? Healthy: documented baseline; current vs. baseline tracked. Stalling: “we’re working on the metrics.”
M12. What did production tell you that dev didn’t? Healthy: named drift events; eval-driven rollback; named lessons. Stalling: “production looks fine.”
M12. Has the program survived a sponsor-rotation event? Healthy: yes, sponsor change without program disruption. Stalling: hasn’t been tested; sponsor still in place but at risk.
M12. Have you had to push back governance? Healthy: embedded governance, no late-stage vetoes. Stalling: one or more late-stage governance kills.
M12. What’s the second-wave investment? Healthy: platform/practice extension to next domain. Stalling: still defending first wave.
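The checklist is most useful when the same questions get asked the same way in every review. A minimal sketch of the month-3 rows encoded as data with a naive scoring rule; the dataclass fields, question keys, and `health_score` function are our own illustrative choices, not part of the report’s framework:

```python
# Hypothetical encoding of the M3 checklist rows so an outside reviewer
# can score a program consistently across quarters.
from dataclasses import dataclass

@dataclass(frozen=True)
class Check:
    month: str
    question: str   # short key for the question
    healthy: str
    stalling: str

CHECKLIST = [
    Check("M3", "named-dri", "one named person; on-call rotation defined",
          "'the team'; rotation undefined"),
    Check("M3", "eval-owner", "named eval owner; framework in repo with daily runs",
          "'we measure CSAT'; no per-tier metrics"),
    Check("M3", "target-sor", "named SoR with integration plan and identity story",
          "'we're starting with a chat surface'"),
    Check("M3", "embedded-governance", "named individual attending delivery standups",
          "'we'll loop them in for review'"),
]

def health_score(answers: dict[str, bool]) -> float:
    """Fraction of checks where the healthy pattern holds; unanswered counts as stalling."""
    healthy = sum(1 for c in CHECKLIST if answers.get(c.question, False))
    return healthy / len(CHECKLIST)

print(health_score({"named-dri": True, "eval-owner": True}))  # 0.5
```

Treating “unanswered” as stalling is deliberate: a program that cannot answer a diagnostic question is exhibiting the stalling pattern.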

17. Choosing and transitioning between archetypes

The four archetypes are not a maturity ladder. They are answers to different constraints. Choice depends on five questions.

  1. How similar are your AI use cases at the infrastructure layer? High similarity favors platform-led; low similarity favors delivery-led.
  2. How concentrated is your competitive advantage? A small number of high-stakes flagship surfaces favors Centralized Delivery; a broad portfolio favors platform-led models.
  3. How much domain difference exists across business units? High domain difference (regulatory, product, geographic) favors federated models.
  4. How strong is central authority in your firm’s culture? Weak central authority cannot run Federated Platform; that pattern decays to Federated Delivery without enforcement.
  5. What’s your governance regime? Heavy regulation (financial services, healthcare, EU jurisdictions under the AI Act) raises the value of platform-led governance integration.
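The five questions can be read as a rough decision procedure. The sketch below is an editorial simplification, our own rule ordering over boolean answers, not a validated model; it encodes the claims above, including that weak central authority forecloses Federated Platform:

```python
# Hypothetical decision sketch over the five archetype-selection questions.
# Rule ordering and the boolean reduction are editorial simplifications.

def suggest_archetype(
    infra_similarity_high: bool,     # Q1: use cases share infrastructure?
    advantage_concentrated: bool,    # Q2: few high-stakes flagship surfaces?
    domain_difference_high: bool,    # Q3: BUs differ by regulation/product/geography?
    central_authority_strong: bool,  # Q4: can the center enforce standards?
    heavy_regulation: bool,          # Q5: AI Act / financial / healthcare regime?
) -> str:
    # Q1 and Q5 both push toward platform-led models.
    platform_led = infra_similarity_high or heavy_regulation
    if domain_difference_high:
        # Federated Platform requires central authority; otherwise it decays.
        if platform_led and central_authority_strong:
            return "Federated Platform"
        return "Federated Delivery / CoE"
    if advantage_concentrated and not platform_led:
        return "Centralized Delivery"
    return "Centralized Platform" if platform_led else "Centralized Delivery"

print(suggest_archetype(True, False, True, True, True))   # Federated Platform
print(suggest_archetype(False, True, False, False, False))  # Centralized Delivery
```

Real answers are rarely clean booleans; the value of the sketch is forcing the five questions to be answered explicitly before an archetype is chosen.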

Figure 6, Common Transition Paths

                     [Centralized Delivery]
                       (flagship-first)
                              │
                              │ ① add platform capability
                              ▼
                     [Centralized Platform]
                       (shared rails for many use cases)
                              │
                              │ ② push delivery to BUs while keeping rails
                              ▼
                     [Federated Platform]
                       (BU delivery, central platform/governance)
                              │
                              │ ③ central authority weakens; standards drift
                              ▼
                     [Federated Delivery / CoE]
                       (BU delivery, CoE advisory only)
                              │
                              │ ④ recentralization after stall or incident
                              ▼
                     [Centralized Platform or Delivery]

Three observations on transitions.

Path ① (Centralized Delivery → Centralized Platform) is common and well-supported by the public record. Michelangelo began as a central Uber effort supporting specific UberEATS and rides use cases and evolved into a general platform.965 Walmart’s Element followed a similar path. The cost is platform-team standup and an internal-customer model.

Path ② (Centralized Platform → Federated Platform) is the highest-leverage transition for firms with growing scale. DBS executed this transition deliberately over a decade, embedding 700 data professionals in business units while keeping platforms central.28 The cost is maintaining central authority; without it, the model decays.

Path ③ (Federated Platform → Federated Delivery / CoE) is the silent failure mode. Standards drift, BU forks proliferate, governance gets appended late. Recentralization (Path ④) is then forced by an incident, Zillow-style, or by regulatory pressure (the EU AI Act compliance cycle through 2026-2027 is currently driving this in several large EU-operating firms).5354

The cost of moving is real. Lyft’s hybrid SageMaker/Kubernetes architecture in 2025 was explicitly framed as “replacing the execution engine while keeping ML workflows completely unchanged”, a model for transition without disruption.19 Most transitions are not so clean. The default counsel is: stay where you are unless a named condition of success is failing, and when you move, move toward the archetype that makes that specific condition easier.

18. Methodology and sources

This report is based on primary published sources from named enterprises (annual reports, SEC filings, engineering blogs under named authors, on-the-record executive interviews, peer-reviewed and arXiv-published technical papers), peer-reviewed legal commentary, and reputable trade press used as context only.

We applied four evidence tiers to every claim. Tier 1 (documented public fact) requires direct citation to a primary source. Tier 2 (industry-attested pattern) requires at least two independent named sources. Tier 3 (reasoned inference) is flagged in-text (“the pattern suggests”). Tier 4 (editorial judgment) is rare and explicitly attributed.

Three limitations matter. First, primary sources skew toward firms that publish; quieter operating models with strong outcomes are underrepresented. Second, survey-based findings (MIT NANDA’s 95% pilot stall figure; Stanford HAI’s 78% organizational adoption rate; multiple shadow-AI surveys) are flagged where used; methodology varies and we have not independently audited the underlying datasets. Third, the Klarna case sits in an unusual position, the company’s narrative shifted within a 12-month window, and we have privileged the May 2025 Bloomberg interview and CEO admission over the February 2024 launch press release.404166

The report does not rely on McKinsey, BCG, Bain operating-model frameworks as load-bearing evidence; cited consultancy material (for instance, the McKinsey case study of DBS) is treated as context rather than authority, and its claims are corroborated by primary DBS sources where they appear in the report. Vendor “thought leadership” decks were excluded except where the vendor was a named partner in a documented engagement (OpenAI’s Klarna and Morgan Stanley case studies are used as evidence of the engagements, not as evidence of vendor-claimed outcomes; the Klarna case study has been substantially superseded by the company’s own subsequent reversal).

Cross-references: Pillar 1 establishes the manifesto for the applied layer and the organizational dimension this report builds on. Pillar 2 takes up architecture choices that interact with operating-model archetype. Pillar 4 develops platform-choice and vendor criteria as a function of archetype. Pillar 5 takes governance integration and evaluation as operational practice in depth.


TL;DR

  • Operating model dominates technology choice as the determinant of enterprise AI outcomes. Two firms with comparable ambition, talent, and vendor stacks produce divergent results because their operating models, the design authority, build capacity, governance regime, run model, and funding flow, differ. JPMC, DBS, Morgan Stanley, Sanofi, and Walmart show what happens when these are coherent; McDonald’s, Air Canada, Zillow, and Klarna’s customer-service automation show what happens when they are not.
  • The population of operating models clusters into four archetypes on a 2x2 grid. Centralized Platform (Uber, Netflix, Walmart, LinkedIn, Spotify), Centralized Delivery (Bloomberg, Morgan Stanley, JPMC LLM Suite), Federated Platform (DBS, Capital One, Sanofi M&S), and Federated Delivery / CoE (Goldman GS AI, Sanofi Research, much of pharma and insurance).
  • Six conditions of success account for most of the variance among healthy programs: production reach, evaluation in production, integration to systems of record, governance integrated with delivery, talent retained around an applied-layer practice, and stable executive sponsorship. Each archetype makes some easier and some harder; the interactions are predictable enough to guide intervention, summarized in Figure 4, the synthesis figure of this report.

  1. JPMorganChase, “Teresa Heitsenrether, Leadership,” accessed May 1, 2026, https://www.jpmorganchase.com/about/leadership/teresa-heitsenrether. 
  2. Penny Crosman, “How JPMorganChase democratized employee access to gen AI,” American Banker, May 2025, accessed May 1, 2026, https://www.americanbanker.com/news/how-jpmorganchase-democratized-employee-access-to-gen-ai. 
  3. JPMorgan Chase & Co., “Form ARS/A, FY2023 Annual Report,” U.S. Securities and Exchange Commission, filed 2024, https://www.sec.gov/Archives/edgar/data/0000019617/000001961724000286/annualreport-2023.pdf. 
  4. Jonathan Maze, “McDonald’s is ending its drive-thru AI test,” Restaurant Business, June 14, 2024, accessed May 1, 2026, https://www.restaurantbusinessonline.com/technology/mcdonalds-ending-its-drive-thru-ai-test. 
  5. Amelia Lucas, “McDonald’s to end AI drive-thru test with IBM,” CNBC, June 17, 2024, https://www.cnbc.com/2024/06/17/mcdonalds-to-end-ibm-ai-drive-thru-test.html. 
  6. Connie Lin, “Bacon ice cream: McDonald’s AI drive-thru ordering full of glitches,” Fast Company, June 18, 2024, https://www.fastcompany.com/91142882/mcdonalds-ai-drive-thru-ordering-glitches. 
  7. Jeanne W. Ross, Peter Weill, and David C. Robertson, Enterprise Architecture as Strategy: Creating a Foundation for Business Execution (Boston: Harvard Business Review Press, 2006). 
  8. The Open Group, “TOGAF Standard, 10th Edition,” accessed May 1, 2026. 
  9. Kai Wang, et al., “From Predictive to Generative, How Michelangelo Accelerates Uber’s AI Journey,” Uber Engineering Blog, 2024, https://www.uber.com/blog/from-predictive-to-generative-ai/. 
  10. Jeremy Hermann and Mike Del Balso, “Meet Michelangelo: Uber’s Machine Learning Platform,” Uber Engineering Blog, September 2017, https://www.uber.com/blog/michelangelo-machine-learning-platform/. 
  11. Netflix, “Metaflow GitHub repository, README,” accessed May 1, 2026, https://github.com/Netflix/metaflow. 
  12. David J. Berg et al., “Supporting Diverse ML Systems at Netflix,” Netflix Tech Blog, March 2024, https://netflixtechblog.com/supporting-diverse-ml-systems-at-netflix-2d2e6b6d205d. 
  13. Divita Vohra and Mike Seid, “Introducing the Hendrix ML Platform: An Evolution of Spotify’s ML Infrastructure,” QCon New York 2023, https://qconnewyork.com/presentation/jun2023/introducing-hendrix-ml-platform-evolution-spotifys-ml-infrastructure. 
  14. Karthik Ramgopal et al., “The LinkedIn Generative AI Application Tech Stack: Extending to Build AI Agents,” LinkedIn Engineering Blog, accessed May 1, 2026, https://www.linkedin.com/blog/engineering/generative-ai/the-linkedin-generative-ai-application-tech-stack-extending-to-build-ai-agents. 
  15. LinkedIn Engineering, “How we engineered LinkedIn’s Hiring Assistant,” LinkedIn Engineering Blog, accessed May 1, 2026, https://www.linkedin.com/blog/engineering/ai/how-we-engineered-linkedins-hiring-assistant. 
  16. Walmart Global Tech, “From models to agents: A new era of intelligent systems at Walmart,” accessed May 1, 2026, https://public.walmart.com/content/walmart-global-tech/en_us/blog/post/wibey-announcement.html. 
  17. Sharon Goldman, “Walmart’s CTO places bigger bets on generative AI as customer shopping habits evolve,” Fortune, October 16, 2024, https://fortune.com/2024/10/16/walmart-cto-shopping-ai/. 
  18. Spotify Engineering, “Empowering Traceable and Auditable ML in Production at Spotify with Hendrix,” MLconf 2022, https://mlconf.com/sessions/empowering-traceable-and-auditable-ml-in-production-at-spotify-with-hendrix/. 
  19. Yaroslav Yatsiuk, “LyftLearn Evolution: Rethinking ML Platform Architecture,” Lyft Engineering, December 2025, https://eng.lyft.com/lyftlearn-evolution-rethinking-ml-platform-architecture-547de6c950e1. 
  20. Pinterest Engineering, “ML Platform retrospective 2014-2025,” 2025. 
  21. Bloomberg LP, “Introducing BloombergGPT,” March 30, 2023, https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/. 
  22. Shijie Wu et al., “BloombergGPT: A Large Language Model for Finance,” arXiv:2303.17564 (revised December 2023), https://arxiv.org/abs/2303.17564. 
  23. OpenAI, “Morgan Stanley uses AI evals to shape the future of financial services,” 2024, https://openai.com/index/morgan-stanley/. 
  24. Hugh Son, “Morgan Stanley kicks off generative AI era on Wall Street with assistant for financial advisors,” CNBC, September 18, 2023, https://www.cnbc.com/2023/09/18/morgan-stanley-chatgpt-financial-advisors.html. 
  25. Morgan Stanley, “Launch of AI @ Morgan Stanley Debrief,” press release, June 26, 2024, https://www.morganstanley.com/press-releases/ai-at-morgan-stanley-debrief-launch. 
  26. JPMorganChase, “LLM Suite named 2025 ‘Innovation of the Year’ by American Banker,” accessed May 1, 2026, https://www.jpmorganchase.com/about/technology/blog/llmsuite-ab-award. 
  27. Singapore Economic Development Board, “How DBS, Southeast Asia’s largest bank, is capturing the full value of AI and Machine Learning in Singapore,” accessed May 1, 2026, https://www.edb.gov.sg/en/business-insights/insights/how-dbs-southeast-asias-largest-bank-is-capturing-the-full-value-of-ai-and-machine-learning-in-singapore.html. 
  28. McKinsey & Company, “DBS CEO Tan Su Shan on building a gen AI-enabled bank with a heart,” February 2026, https://www.mckinsey.com/featured-insights/future-of-asia/dbs-ceo-tan-su-shan-on-building-a-gen-ai-enabled-bank-with-a-heart. 
  29. DBS Bank, “AI-Powered Personalised Nudges & Investment Ideas,” accessed May 1, 2026, https://www.dbs.com/artificial-intelligence-machine-learning/index.html. 
  30. Capital One Tech, “Accelerating AI with Data Management,” accessed May 1, 2026, https://www.capitalone.com/tech/ai/data-management/. 
  31. Sanofi, “A New Era of Biopharma Production with AI,” accessed May 1, 2026, https://www.sanofi.com/en/magazine/our-science/a-new-era-of-biopharma-production-with-ai. 
  32. Sanofi, “Digital Transformation and Artificial Intelligence,” accessed May 1, 2026, https://www.sanofi.com/en/our-science/digital-artificial-intelligence. 
  33. AWS, “Sanofi Drives Scientific Innovation with AI on AWS for Real-Life Patient Impact,” accessed May 1, 2026, https://aws.amazon.com/solutions/case-studies/sanofi-case-study/. 
  34. Goldman Sachs, “Marco Argenti, Leadership Profile,” accessed May 1, 2026, https://www.goldmansachs.com/our-firm/our-people-and-leadership/leadership/management-committee/marco-argenti. 
  35. Goldman Sachs, “AI Exchanges: CIO Marco Argenti on the future of AI in the workplace,” March 2025, https://www.goldmansachs.com/insights/goldman-sachs-exchanges/ai-exchanges-cio-marco-argenti-on-the-future-of-ai-in-the-workplace. 
  36. AWS Machine Learning Blog, “Establishing an AI/ML center of excellence,” accessed May 1, 2026, https://aws.amazon.com/blogs/machine-learning/establishing-an-ai-ml-center-of-excellence/. 
  37. CIO magazine, “Optimizing patient care at Sanofi through AI,” 2025, https://www.cio.com/article/4049083/sanofi-applies-ai-to-deliver-more-effective-and-accessible-medicines-to-patients.html. 
  38. Sheryl Estrada, “MIT report: 95% of generative AI pilots at companies are failing,” Fortune, August 18, 2025, https://fortune.com/2025/08/18/mit-report-95-percent-generative-ai-pilots-at-companies-failing-cfo/. 
  39. MIT NANDA, “The GenAI Divide: State of AI in Business 2025,” 2025. 
  40. Bloomberg, “Klarna Turns From AI to Real Person Customer Service,” May 8, 2025, https://www.bloomberg.com/news/articles/2025-05-08/klarna-turns-from-ai-to-real-person-customer-service. 
  41. Sherin Shibu, “Klarna Is Hiring Customer Service Agents After AI Couldn’t Cut It on Calls,” Entrepreneur, May 9, 2025, https://www.entrepreneur.com/business-news/klarna-ceo-reverses-course-by-hiring-more-humans-not-ai/491396. 
  42. Vinay Kakade and Shiraz Zaman, “LyftLearn: ML Model Training Infrastructure built on Kubernetes,” Lyft Engineering, https://eng.lyft.com/lyftlearn-ml-model-training-infrastructure-built-on-kubernetes-aef8218842bb. 
  43. Lucas Bernardi et al., “150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com,” KDD ‘19, ACM SIGKDD Conference 2019, https://dl.acm.org/doi/10.1145/3292500.3330744. 
  44. Zillow Group, Form 8-K, November 2, 2021. 
  45. Patrick Clark and Noah Buhayar, “Zillow shutdown of Zillow Offers,” Bloomberg News, November 2021. 
  46. Cynthia Calongne et al., “Exploring the Role of AI in the Closure of Zillow Offers,” Journal of Information Systems Education 35, no. 1 (2024): 67-72, https://jise.org/Volume35/n1/JISE2024v35n1pp67-72.pdf. 
  47. Ibid. 
  48. BlackFog, “Shadow AI Threat Grows Inside Enterprises,” January 27, 2026, https://www.blackfog.com/blackfog-research-shadow-ai-threat-grows/. 
  49. Lenovo, “Work Reborn Research Series 2026,” 2026. 
  50. IBM, “Is rising AI adoption creating shadow AI risks?” accessed May 1, 2026, https://www.ibm.com/think/insights/rising-ai-adoption-creating-shadow-risks. 
  51. Moffatt v. Air Canada, 2024 BCCRT 149 (British Columbia Civil Resolution Tribunal, February 14, 2024). 
  52. Lisa R. Lifshitz and Roland Hung, “BC Tribunal Confirms Companies Remain Liable for Information Provided by AI Chatbot,” American Bar Association Business Law Today, February 2024, https://www.americanbar.org/groups/business_law/resources/business-law-today/2024-february/bc-tribunal-confirms-companies-remain-liable-information-provided-ai-chatbot/. 
  53. European Commission, “AI Act,” accessed May 1, 2026, https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai. 
  54. Cloud Security Alliance, “EU AI Act High-Risk Deadline: Enterprise Readiness Gap,” March 13, 2026, https://labs.cloudsecurityalliance.org/research/csa-research-note-eu-ai-act-high-risk-compliance-deadline-20/. 
  55. DBS Bank, “DBS’ AI-Powered Digital Transformation,” accessed May 1, 2026, https://www.dbs.com/artificial-intelligence-machine-learning/artificial-intelligence/dbs-ai-powered-digital-transformation.html. 
  56. Pave, “2025 AI & ML Compensation Trends & Practices Report,” 2025, https://www.pave.com/blog-posts/ai-ml-talent-insights-4-key-takeaways-from-our-2025-report. 
  57. KORE1, “How to Hire AI Engineers in 2026: Complete Staffing Guide,” 2026. 
  58. Kai Wang, “ML Platform at Uber, Past, Present, and Future,” MLOps Community, https://home.mlops.community/public/videos/ml-platform-at-uber-past-present-and-future. 
  59. Konstantin Gizdarski, “Building Real-time Machine Learning Foundations at Lyft,” Lyft Engineering, https://eng.lyft.com/building-real-time-machine-learning-foundations-at-lyft-6dd99b385a4e. 
  60. David Xia and Keshi Dai, “How Spotify Built a Robust Ray Platform with a Frictionless Developer Experience,” Anyscale 2024, https://www.anyscale.com/blog/how-spotify-built-a-robust-ray-platform-with-a-frictionless-developer. 
  61. Thomas H. Davenport and Randy Bean, “Portrait of an AI Leader: Piyush Gupta of DBS Bank,” MIT Sloan Management Review, 2022, https://sloanreview.mit.edu/article/portrait-of-an-ai-leader-piyush-gupta-of-dbs-bank/. 
  62. Stanford HAI, “AI Index Report 2025,” 2025, https://hai.stanford.edu/ai-index/2025-ai-index-report. 
  63. Jeff Winter, “The Rise of the CAIO (Chief AI Officer),” Jeff Winter Insights, accessed May 1, 2026, https://www.jeffwinterinsights.com/insights/the-chief-ai-officer-role. 
  64. Sasha Rogelberg, “Klarna’s CEO warns AI will replace human workers,” Fortune, February 3, 2025, https://fortune.com/2025/02/03/klarna-ceo-ai-replacing-human-workers/. 
  65. Mike Del Balso et al., “Scaling Machine Learning at Uber with Michelangelo,” Uber Engineering Blog, https://www.uber.com/us/en/blog/scaling-michelangelo/. 
  66. Klarna, “Klarna AI assistant handles two-thirds of customer service chats in its first month,” press release, February 27, 2024, https://www.klarna.com/international/press/klarna-ai-assistant-handles-two-thirds-of-customer-service-chats-in-its-first-month/. 
