Engine, Track, Driver, Payload — a frame for enterprise GenAI

Most enterprise leaders I talk to have moved past whether to use GenAI. The question now is how to scale it — across many business units, across regulated data, across teams that have very different appetites for risk.

That shift is harder than it sounds. A demo that wows the leadership team is one thing. A GenAI system running inside underwriting, customer support, or industrial operations is something else entirely. The failure modes are no longer "embarrassing chatbot." They are regulatory exposure, mispriced risk, and a recommendation that an operator acts on, with real-world consequences.

To make that scale-up tractable, I've started using a four-part mental model with the executives I work with: Engine, Track, Driver, Payload. It's borrowed from how heavy industry thinks about moving things safely, and it maps cleanly to the four things every serious enterprise AI program has to get right, which are also the four ways most of them quietly fail.

1. The Engine — your proprietary data

Every enterprise sits on a moat that has nothing to do with AI: decades of historical records, transactional documents, regulatory correspondence, operational telemetry, customer interactions. Most of it lives in legacy systems, regional silos, and PDFs nobody can search.

That data is the engine. Public foundation models — Claude, GPT, Gemini — are powerful, but the unique value of your enterprise AI is never the model itself. It's what the model knows about you: your customers, your contracts, your operations, your past decisions.

The first investment isn't a model. It's getting that data consolidated onto a unified, governed layer where it can actually be retrieved, secured, and audited. Until that's done, you don't have an engine. You have data exhaust.

2. The Track — alignment and architecture

A powerful engine without a track is a liability. Public foundation models are trained to produce plausible text, largely from publicly available data. They don't, on their own, know your regulatory obligations, your tolerance for error, or which systems should never be touched without a human in the loop.

Alignment is the engineering work that forces a general-purpose model to behave inside your specific rules. In practice, that's three things.

Separate generation from calculation. Let the model do natural-language work — routing questions, drafting summaries, explaining decisions. Don't let it do the math. Pricing, risk scoring, financial forecasting — those should run on your existing deterministic systems, with the LLM acting as an interface, not the calculator.
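To make that split concrete, here is a minimal Python sketch. Everything in it is hypothetical: `price_policy` stands in for whatever deterministic rating engine you already run, the 0.002 rate is made up for illustration, and `draft_explanation` is a plain function standing in for an LLM call. The point is only the direction of flow: the number is computed first, by ordinary code, and the language layer is handed the result to put into words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    premium: float
    explanation: str


def price_policy(sum_insured: float, risk_factor: float) -> float:
    """Deterministic pricing rule (hypothetical stand-in for an existing rating engine)."""
    base_rate = 0.002  # assumed illustrative rate, not a real tariff
    return round(sum_insured * base_rate * risk_factor, 2)


def draft_explanation(premium: float, sum_insured: float, risk_factor: float) -> str:
    """Stand-in for an LLM call: it words the result but never computes it."""
    return (
        f"The premium of {premium:.2f} reflects a sum insured of "
        f"{sum_insured:.0f} and a risk factor of {risk_factor}."
    )


def answer_pricing_question(sum_insured: float, risk_factor: float) -> Quote:
    # The number always comes from the deterministic system.
    premium = price_policy(sum_insured, risk_factor)
    # The language layer only receives the finished number to explain.
    return Quote(premium, draft_explanation(premium, sum_insured, risk_factor))
```

In a real deployment the explanation step would call your model of choice, but the contract stays the same: the model can be swapped or retrained without any price changing.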

Build clean data boundaries. In a multi-business enterprise, in M&A, or in any environment where lines of business have to stay logically separated, your AI architecture needs the same property. Permissions, embeddings, and retrieval scopes have to be partitioned per business unit and detachable on demand. If you can't divest an asset cleanly because its data has bled into a shared AI layer, you've created a problem you didn't have before.
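One way to picture that property is a retrieval index that is partitioned by business unit from day one. The sketch below is a toy under stated assumptions: `PartitionedIndex` is a hypothetical class, the keyword match stands in for real embedding search, and the business unit names are invented. What it illustrates is that a query only ever sees the partitions the caller is allowed to see, and that one unit's data can be lifted out without touching the rest.

```python
from collections import defaultdict


class PartitionedIndex:
    """Toy document store with one partition per business unit (hypothetical design)."""

    def __init__(self) -> None:
        self._partitions: dict[str, list[str]] = defaultdict(list)

    def add(self, business_unit: str, document: str) -> None:
        """Store a document inside its own unit's partition only."""
        self._partitions[business_unit].append(document)

    def search(self, allowed_units: set[str], term: str) -> list[str]:
        """Keyword match standing in for vector retrieval, limited to the caller's units."""
        hits = []
        for unit in sorted(allowed_units):
            for doc in self._partitions.get(unit, []):
                if term.lower() in doc.lower():
                    hits.append(doc)
        return hits

    def detach(self, business_unit: str) -> list[str]:
        """Remove a unit's partition entirely, for example ahead of a divestiture."""
        return self._partitions.pop(business_unit, [])


index = PartitionedIndex()
index.add("insurance", "Flood claim policy")
index.add("energy", "Turbine flood report")

insurance_only = index.search({"insurance"}, "flood")  # sees the insurance partition only
divested = index.detach("energy")                      # energy data leaves as one unit
```

If detaching a unit in your real architecture would require re-embedding or re-indexing everything else, that is a sign the boundary is not as clean as it looks on the diagram.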

Keep AI out of critical execution loops. For anything safety-critical, regulated, or otherwise high-consequence, AI sits in an advisory layer. It explains, recommends, drafts, summarizes — humans approve and execute. The cost of getting this boundary wrong is much higher than the cost of slower throughput.
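The advisory boundary can be enforced in code rather than left to policy. Here is a minimal sketch, assuming a hypothetical `Recommendation` record and hypothetical `approve` and `execute` functions: the AI can create a recommendation, but nothing runs until a named human has signed off on it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recommendation:
    action: str
    rationale: str
    approved_by: Optional[str] = None  # stays empty until a human signs off


def approve(rec: Recommendation, reviewer: str) -> Recommendation:
    """Record the human sign-off; a real system would also write an audit entry."""
    rec.approved_by = reviewer
    return rec


def execute(rec: Recommendation, run: Callable[[str], str]) -> str:
    """Run the action only if a named human has approved it."""
    if rec.approved_by is None:
        raise PermissionError("recommendation has no human approval")
    return run(rec.action)
```

The AI layer only ever produces `Recommendation` objects. The `run` callable, which is whatever actually touches the operational system, is reachable only through `execute`, so an unapproved recommendation fails loudly instead of slipping through.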

3. The Driver — the people running the system

A modern engine on a perfect track still needs a driver who knows what they're doing. This is the part most enterprise AI programs underinvest in.

The frontline people who currently do the work — underwriters, asset managers, operations leads, support agents — don't disappear when AI shows up. Their job changes. They go from doing the work to supervising and validating a system that produces drafts and recommendations much faster than they could on their own.

That's a real skills shift. The good news: it's mostly trainable. The bad news: nobody trains for it accidentally. If you deploy AI without upskilling the people who are supposed to oversee it, you end up with either silent rubber-stamping — the AI is trusted too much — or quiet refusal, where the AI is ignored and the investment never lands.

“The single most important hire in an enterprise AI program is usually not a researcher. It's the person who designs the operating model around the AI — who decides what humans review, when, and how.”

4. The Payload — the business outcome

The first three pieces exist for one reason: to move the payload. In enterprise contexts, the payload is rarely "we shipped an AI feature." It's faster underwriting cycles, fewer compliance exceptions, lower cost-to-serve, shorter time-to-decision in operations.

This is the part executives need to define before anything else gets built. What business outcome are we measuring? On what timeline? Against what baseline?

Without those answers up front, you end up with an impressive AI capability that floats free of the P&L, gets quietly defunded after eighteen months, and shows up in the next strategy deck as a lesson learned.

The synthesis

Most enterprise GenAI programs I've seen fail in one of four ways. They underbuild the data foundation and ship demos that don't generalize. They underbuild alignment and create regulatory or reputational exposure. They underbuild the human operating model and the technology gets rubber-stamped or quietly ignored. Or they never decide what business outcome they're actually trying to move.

The engine-track-driver-payload frame doesn't make any of this easier. It just makes it harder to skip a step. Each piece is necessary. None is sufficient on its own.

If you're a leader trying to scale GenAI across a complex enterprise right now, the most useful question to keep asking your team is not "can we do this with AI?" It's "which of the four are we currently weakest on, and what would it cost to fix that this quarter?"

That's the conversation that moves enterprise AI from a slide deck to an operating model.