Case study · Accounting

AI triage on 42,000 client queries a year — freeing 9,000 partner-hours.

A UK Top-50 accountancy firm was losing partner time to recurring client queries. We built a RAG agent that auto-drafts responses with citations and routes exceptions to partners, clearing the drag on advisory capacity.

73% · Auto-resolved
9,000h · Partner time freed/yr
£680k · Advisory uplift
11 wks · Kick-off to live
Overview

Advisory work was being crowded out by routine.

Partners were spending hours a week on queries the firm had answered hundreds of times. Client satisfaction was solid; partner capacity for higher-value advisory was not. The firm wanted to fix that without adding headcount.

We scoped the problem, built a production-grade RAG system integrated with their practice-management platform, and landed it inside 11 weeks.

Approach

Production, not a prototype.

01

Week 1: scoping + evals

Collected 800 ground-truth queries and built an offline eval harness before writing prompt v1.

02

Weeks 2–3: retrieval

Ingested the firm's knowledge base (VAT, payroll, R&D, corporate tax) and built hybrid retrieval with re-ranking.

03

Weeks 4–8: agent & UI

Tool-using agent, structured outputs, human-review workflow, partner dashboard.

04

Weeks 9–10: red-team & rollout

Safety testing, partner training, phased rollout across three offices.

05

Ongoing: operate

Quality monitoring, retrieval refresh, cost optimisation.
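
To make the "hybrid retrieval" step in 02 concrete, here is a minimal sketch of one common way to merge a keyword ranking and a vector ranking: reciprocal rank fusion (RRF). The case study does not say which fusion method was used, so RRF, the function name, the constant k=60, and the document IDs are all illustrative assumptions rather than the production design.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a value
    taken from the case study.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID so the
    # output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical document IDs returned by two retrievers for the same query.
keyword_hits = ["vat-registration", "vat-thresholds", "payroll-rti"]
vector_hits = ["vat-thresholds", "rd-relief", "vat-registration"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
# ['vat-thresholds', 'vat-registration', 'rd-relief', 'payroll-rti']
```

Documents that both retrievers rank highly rise to the top. In a setup like the one described, a re-ranking model would then re-score only the first few fused candidates before they reach the agent; that step is not shown here.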

73% · Queries auto-resolved
9,000h · Partner hours freed / year
£680k · Annual advisory uplift
−71% · Avg. response time
0 · High-stakes errors in 6 months
4.6/5 · Client satisfaction
Technology

Tools & frameworks we use.

Azure OpenAI
Claude
LangGraph
pgvector
Langfuse
Ragas
Microsoft Entra ID
Azure App Service
FAQ

Common questions.

What about hallucinations?
Structured outputs use a citation-required schema, and answers without retrieval hits escalate to a human. An evaluation harness gates every prompt change before deploy.
Did partners trust it?
Only after the rollout showed parity with partner-drafted responses in blind review. Trust was earned, not assumed.
What's the cost model?
Per-session unit economics are tracked in Langfuse. The average cost per auto-resolved query is 1% of the cost of a partner-hour.
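
The hallucination guardrail described in the first answer above (a citation-required schema, with escalation when retrieval returns nothing) could look roughly like the sketch below. The class names, field names, and routing labels are hypothetical and chosen for illustration; the firm's actual schema is not published in this case study.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    source_id: str  # hypothetical knowledge-base document ID
    quote: str      # passage the drafted answer relies on


@dataclass
class DraftAnswer:
    text: str
    citations: list[Citation] = field(default_factory=list)


def route(draft: DraftAnswer, retrieved_ids: set[str]) -> str:
    """Return "auto_resolve" or "escalate" for a drafted answer."""
    # No retrieval hits: there is nothing to ground the answer in.
    if not retrieved_ids:
        return "escalate"
    # The schema requires at least one citation.
    if not draft.citations:
        return "escalate"
    # Every citation must point at a document that was actually retrieved.
    if any(c.source_id not in retrieved_ids for c in draft.citations):
        return "escalate"
    return "auto_resolve"


# Placeholder text and IDs, for illustration only.
grounded = DraftAnswer(
    text="Drafted reply citing the firm's VAT guidance.",
    citations=[Citation("vat-thresholds", "Relevant passage from the guidance.")],
)
uncited = DraftAnswer(text="Drafted reply with no sources.")

print(route(grounded, {"vat-thresholds", "vat-registration"}))  # auto_resolve
print(route(uncited, {"vat-thresholds"}))                       # escalate
print(route(grounded, set()))                                   # escalate
```

Keeping the routing rule as plain code, separate from the prompt, means it can be checked against the ground-truth query set each time a prompt changes.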
Ready when you are

Put your data to work.

Book a free 30-minute consultation with a senior Databuzz consultant.