Agentic workflows: what actually changes in the enterprise.

"Agentic" is doing a lot of work in 2026 marketing. Most clients we speak to cannot cleanly distinguish an agent from a chain-of-prompts from a traditional workflow engine with an LLM node. This article is the definition we use internally, and the handful of places we have actually seen agentic patterns earn their keep in enterprise.

A working definition

An agentic workflow is one where a language model decides, at run time, which tool to invoke, in what order and against what inputs, then observes the result and decides whether to continue, retry or stop. The distinguishing quality is dynamic control flow. A pipeline where an LLM reads a ticket, classifies it, and hands off to a fixed set of steps is not agentic; it is a pipeline with an LLM in it. Both can be useful; only one warrants the vocabulary.
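To make the distinction concrete, here is a minimal sketch of the two shapes side by side. Everything in it is illustrative: the tool names, the `decide` callback and `toy_policy` are hypothetical stand-ins (in a real system `decide` would be a model call), and the step budget is our own assumption rather than part of the definition.

```python
from typing import Callable

# Hypothetical tools: plain functions standing in for real integrations.
TOOLS: dict[str, Callable[[str], str]] = {
    "classify": lambda text: "billing" if "invoice" in text.lower() else "other",
    "lookup": lambda text: f"record for: {text}",
}


def fixed_pipeline(ticket: str) -> list[str]:
    """Not agentic: the step order is fixed by the designer, not chosen at run time."""
    label = TOOLS["classify"](ticket)
    record = TOOLS["lookup"](ticket)
    return [label, record]


# (action, argument); action is a tool name or the literal "stop".
Decision = tuple[str, str]


def agent_loop(
    task: str,
    decide: Callable[[str, list[str]], Decision],
    max_steps: int = 10,
) -> list[str]:
    """Agentic: `decide` picks the next tool after seeing every result so far."""
    observations: list[str] = []
    for _ in range(max_steps):
        action, argument = decide(task, observations)
        if action == "stop":
            return observations
        if action not in TOOLS:
            # Feed the error back so the decider can correct itself.
            observations.append(f"error: unknown tool {action!r}")
            continue
        observations.append(TOOLS[action](argument))
    raise RuntimeError("step budget exhausted without a stop decision")


def toy_policy(task: str, observations: list[str]) -> Decision:
    """Stand-in for an LLM call: chooses the next step from what it has observed."""
    if not observations:
        return ("classify", task)
    if observations[-1] == "billing":
        return ("lookup", task)
    return ("stop", "")


print(fixed_pipeline("Invoice 42 is wrong"))
print(agent_loop("Invoice 42 is wrong", toy_policy))
print(agent_loop("hello", toy_policy))
```

The pipeline always runs both steps. The loop runs the lookup only when the classification result calls for it, which is the dynamic control flow the definition turns on.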

Where agents earn their keep

1. Long-tail data extraction

Structured extraction from heterogeneous documents — invoices with wildly varying layouts, planning applications from 300 different templates, medical records from decades of different systems. A fixed pipeline forces the designer to enumerate variation; an agent can navigate it. We have seen extraction precision lift by 8–14 percentage points on legacy document estates when moving from fixed to agentic extraction.

2. Research and synthesis

Where the output is a document that must cite, cross-reference and summarise a large corpus: due diligence packs, scrutiny papers, market reports. The agent can retrieve, decide whether it has enough evidence, and iterate. The non-agentic alternative is a single very long prompt, and it performs worse.

3. Orchestration of many small tools

Customer service, where the agent chooses between "look up order", "issue refund", "transfer to human", "update address" based on the conversation. A fixed state machine covers the top five intents but gets brittle at the long tail; an agent handles the long tail more gracefully, with the state machine as a floor.
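One way to read "the state machine as a floor" is as a router: known intents take the deterministic path, and only the remainder reaches the agent. The sketch below assumes a toy keyword matcher and hypothetical handler names. A real deployment would use a proper intent classifier and a model-backed agent.

```python
from typing import Callable, Optional

# Hypothetical deterministic handlers for the well-understood intents.
FIXED_HANDLERS: dict[str, Callable[[str], str]] = {
    "order_status": lambda message: "tool:look_up_order",
    "refund": lambda message: "tool:issue_refund",
    "change_address": lambda message: "tool:update_address",
}


def match_intent(message: str) -> Optional[str]:
    """Toy keyword matcher, for illustration only."""
    text = message.lower()
    if "where is my order" in text:
        return "order_status"
    if "refund" in text:
        return "refund"
    if "address" in text:
        return "change_address"
    return None


def route(message: str, agent: Callable[[str], Optional[str]]) -> str:
    """Deterministic path first; the agent only sees what the floor cannot handle."""
    intent = match_intent(message)
    if intent is not None:
        return FIXED_HANDLERS[intent](message)
    answer = agent(message)  # long tail: the agent chooses a tool, or gives up
    return answer if answer is not None else "tool:transfer_to_human"


# An agent stub that always gives up, to show the fallback to a human.
print(route("I want a refund", lambda message: None))
print(route("My parcel arrived but the box is for someone else", lambda message: None))
```

The design choice is that the agent can never override a matched intent; it only extends coverage beyond it.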

Where enterprises overreach

Overreach 1 — replacing workflow engines

Camunda, Temporal, Airflow, ServiceNow Flow — mature tools with deterministic behaviour, replay, observability. "Let's replace it with an agent" is a common 2026 impulse and almost always a mistake. Workflow engines are the floor; agents sit at a few specific points where variability is the point.

Overreach 2 — unbounded tool permission

Giving an agent write-access to production systems — raise an invoice, issue a refund, update a record — without bounded, auditable scopes, rate limits, and a human-in-the-loop for anything above a threshold. Three incidents in our client base last year had the same shape: agent, too many tools, insufficient scoping, unexpected action at scale.
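As an illustration of what a bounded scope can look like for a single write tool, the sketch below wraps a refund action with input validation, a sliding-window rate limit, an approval threshold and an audit log. The class name, the threshold of 50.0 and the limit of five calls per minute are assumptions chosen for the example, not recommended values.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RefundTool:
    """Hypothetical write tool exposed to an agent, with a bounded scope."""

    approval_threshold: float = 50.0  # assumed: above this, a human decides
    max_calls_per_minute: int = 5     # assumed rate limit
    audit_log: list = field(default_factory=list)
    _accepted_at: list = field(default_factory=list, repr=False)

    def issue_refund(self, order_id: str, amount: float, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        # Keep only accepted calls from the last 60 seconds (sliding window).
        self._accepted_at = [t for t in self._accepted_at if now - t < 60.0]

        if amount <= 0:
            outcome = "rejected_invalid_amount"
        elif len(self._accepted_at) >= self.max_calls_per_minute:
            outcome = "rejected_rate_limited"
        elif amount > self.approval_threshold:
            outcome = "queued_for_human_approval"
        else:
            outcome = "executed"  # the real write to the payment system would go here

        if outcome in ("executed", "queued_for_human_approval"):
            self._accepted_at.append(now)
        # Every attempt is recorded, including the rejected ones.
        self.audit_log.append({"order_id": order_id, "amount": amount, "outcome": outcome})
        return outcome


tool = RefundTool(max_calls_per_minute=2)
print(tool.issue_refund("A1", 10.0, now=0.0))
print(tool.issue_refund("A2", 500.0, now=1.0))
print(tool.issue_refund("A3", 10.0, now=2.0))
```

The `now` parameter exists only so the rate limit can be exercised deterministically in tests; callers would normally omit it.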

Overreach 3 — agents of agents

Multi-agent architectures where a "planner" delegates to "researcher", "writer" and "critic" agents. Some academic traction. Little reproducible enterprise value in our experience. One well-scoped agent with the right tools beats three agents passing prompts.

What good looks like

A bounded tool catalogue. Tools scoped to specific actions, with input validation, rate limits, and logging.
A deterministic fallback. If the agent fails or loops, a human path or deterministic path takes over. Never silent.
Evaluation discipline. A golden dataset of inputs and expected outputs. Run on every tool change, prompt change, model change.
Observability. Every tool call, every reasoning step, logged and searchable. Debugging an agent blind is genuinely impossible.
A named owner. Not "the AI team". A specific person accountable for the agent's accuracy, cost and behaviour month by month.
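The evaluation discipline above can be sketched as a small harness that replays a golden dataset and fails the run when accuracy drops below a bar. The dataset, the exact-match comparison and the 0.95 threshold are all assumptions for illustration; real extraction tasks usually need a more forgiving comparison than string equality.

```python
from typing import Callable

# Hypothetical golden dataset: (input, expected output) pairs curated by the owner.
GOLDEN: list[tuple[str, str]] = [
    ("Invoice INV-001 total 120.00", "120.00"),
    ("Total due: 89.50", "89.50"),
    ("Amount 15.00 GBP", "15.00"),
]


def evaluate(
    agent: Callable[[str], str],
    golden: list[tuple[str, str]],
    min_accuracy: float = 0.95,
) -> dict:
    """Run the agent over every golden case and report accuracy and failures."""
    failures = []
    for text, expected in golden:
        try:
            actual = agent(text)
        except Exception as exc:  # a crash counts as a failure, never a skip
            actual = f"error: {exc}"
        if actual != expected:
            failures.append({"input": text, "expected": expected, "actual": actual})
    accuracy = (len(golden) - len(failures)) / len(golden) if golden else 0.0
    return {"accuracy": accuracy, "passed": accuracy >= min_accuracy, "failures": failures}


def toy_agent(text: str) -> str:
    """Stand-in for the real agent: naively returns the last token."""
    return text.split()[-1]


report = evaluate(toy_agent, GOLDEN)
print(report["passed"], report["failures"])
```

Wired into CI, a harness of this shape is what lets a tool, prompt or model change be rejected before it reaches production.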

The cost question

Agentic workflows are more expensive than pipelines. Each task may involve 5–30 LLM calls. At scale this matters. In our deployments, cost-per-task ranges from £0.02 (small model, short task) to £1.20 (large model, research task). Always model the cost per 10k tasks before building.
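The arithmetic is simple enough to do before any design work. This sketch scales the two ends of the cost-per-task range quoted above to a batch of 10k tasks; the function name is ours, and the only inputs are the figures already given.

```python
def cost_per_batch(cost_per_task_gbp: float, tasks: int = 10_000) -> float:
    """Total cost in GBP of running `tasks` tasks at a given cost per task."""
    return round(cost_per_task_gbp * tasks, 2)


low = cost_per_batch(0.02)   # small model, short task
high = cost_per_batch(1.20)  # large model, research task

print(f"per 10k tasks: GBP {low:,.2f} to GBP {high:,.2f}")
# prints "per 10k tasks: GBP 200.00 to GBP 12,000.00"
```

A sixtyfold spread between the two ends is why the model and task type need to be pinned down before the business case is written.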

When to build it

Build agentic if all of the following hold: the input distribution is genuinely heterogeneous; the action space is narrow and reversible; the output can be evaluated against ground truth; a human reviewer is in the loop for anything consequential. Otherwise, a pipeline is cheaper, more reliable and easier to maintain.