How Anthropic Builds Procurement with Claude

There is a certain irony in Anthropic, the company that makes Claude, also being one of its most instructive enterprise users. At DPW New York, Katie Streu, Anthropic's Head of Procurement, did something rare on a conference stage: she skipped the vision deck and went straight to the plumbing. No abstract AI futures. Just a working procurement team, a real set of problems, and three years of learned lessons about what it actually takes to deploy AI agents inside a finance function.

The session was titled "How Anthropic Is Building Procurement with Claude." It should probably also carry a subtitle: "And why most enterprise AI deployments get the sequence wrong."

The Problem Was Never the AI

Streu opened with a blunt observation that every enterprise transformation leader should hear: the teams who struggle with AI agents are usually not struggling because of the technology. They're struggling because they tried to drop a capable model onto a broken process and called it innovation.

Procurement is a function that tends to accumulate entropy quietly. Repetitive Slack questions about invoice processes, purchase orders, submission portals. Metadata errors on purchase requests. Contract risk sitting unrouted in someone's inbox. Legal coding mismatches that only surface after a General Ledger entry. None of these are glamorous problems. But they are the problems that collectively drag a high-value team toward low-value work.

Anthropic's procurement team was not exempt from this. Purchase request volume grew roughly 88% in a single year. Each request required metadata verification, contract review, and GL coding checks. Necessary work, but not the work a senior procurement professional was hired to do.

The instinct in most organisations is to throw automation at it and move on. Streu's team did something harder: they fixed the process first, then designed the agent around the clarity that followed.

Three Agents, Not One

[Figure: Three agents, three scopes. The Intake Validation Agent catches metadata mismatches and the Sensitive Data Agent flags classification errors in Anthropic's procure-to-pay platform. Humans approve every gate.]

The most practically instructive part of the session was the architecture Streu's team built inside their procure-to-pay (P2P) platform. Rather than a single monolithic AI assistant, they deployed three focused agents, each with a distinct scope:

Intake/QA Agent: When a purchase request arrives incomplete, this agent identifies what's missing and asks the requester directly. No more back-and-forth Slack threads or email chains requesting clarification.

Legal Triage Agent: Contract risk is routed based on the nature of the agreement. The agent doesn't adjudicate; it classifies and escalates, ensuring that legal review capacity is allocated to the cases that actually need it.

Data Validation Agent: Before a transaction hits the GL, this agent checks coding accuracy. It catches errors that would otherwise require costly downstream corrections or audit flags.

The critical design principle running through all three: humans approve every gate. The agents flag, suggest, and escalate. The reviewers still review. This distinction (the agent as an intelligent filter, not an autonomous decision-maker) is not a limitation. It is the architecture. It reflects a mature understanding that trust in AI systems is earned incrementally, and that a false positive in a procurement workflow has real downstream consequences.
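The talk did not show implementation details, so what follows is only a minimal sketch of the "agents flag, humans approve" pattern described above. Every field name, GL code, and risk rule in it is a hypothetical placeholder, and the three functions are plain rule checks standing in for model-backed agents.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical placeholders: none of these values come from the talk.
REQUIRED_FIELDS = ("requester", "vendor", "amount", "gl_code", "contract_type")
VALID_GL_CODES = {"6100", "6200", "7300"}
HIGH_RISK_CONTRACT_TYPES = {"data_processing", "custom_msa"}


@dataclass
class Flag:
    agent: str    # which agent raised the flag
    message: str  # what the human reviewer should look at


@dataclass
class Review:
    request: dict
    flags: list = field(default_factory=list)
    approved_by: Optional[str] = None  # set only by a human reviewer


def intake_agent(request: dict) -> list:
    """Flag required fields that are missing or empty."""
    return [Flag("intake", f"missing field: {name}")
            for name in REQUIRED_FIELDS if not request.get(name)]


def legal_triage_agent(request: dict) -> list:
    """Classify and escalate; never decide the contract question itself."""
    if request.get("contract_type") in HIGH_RISK_CONTRACT_TYPES:
        return [Flag("legal_triage", "route to legal review")]
    return []


def data_validation_agent(request: dict) -> list:
    """Flag GL codes that are not in the (assumed) chart of accounts."""
    code = request.get("gl_code")
    if code and code not in VALID_GL_CODES:
        return [Flag("data_validation", f"unknown GL code: {code}")]
    return []


def run_agents(request: dict) -> Review:
    """Collect flags from all three agents. Nothing is approved here."""
    review = Review(request=request)
    for agent in (intake_agent, legal_triage_agent, data_validation_agent):
        review.flags.extend(agent(request))
    return review


def human_approve(review: Review, reviewer: str) -> Review:
    """The gate: only a named human reviewer can mark a request approved."""
    if not reviewer:
        raise ValueError("a human reviewer is required")
    review.approved_by = reviewer
    return review
```

The design choice worth noticing is that no agent function can set `approved_by`; approval is reachable only through `human_approve`, which keeps the agents in the role of filter rather than decision-maker.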

Procurement Pal: The Case for Starting With the Obvious

[Figure: Procurement Pal, a Slack bot grounded in Anthropic's procurement policy, answering an invoice payment question in the #ask-procurement channel in under a minute. The channel has 1,247 members.]

Alongside the workflow agents, Streu's team built something deceptively simple: a Claude-powered Slack bot called Procurement Pal, deployed in a channel called #ask-procurement. Its job is to answer the questions that eat up 20 minutes of a specialist's afternoon: How do I get an invoice paid? Do I need a PO for this? Where do I submit?

The bot answers in under a minute, grounded in Anthropic's actual procurement policy documentation. The slide showed a response to "How do I get an invoice paid?" that was structured, accurate, and sourced: three numbered steps, clear thresholds, specific system references. It is the kind of answer a good procurement analyst would give, available to all 1,247 members of the channel.
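How Procurement Pal retrieves policy text was not described in the session. As a rough illustration of what "grounded in policy documentation" can mean, the sketch below picks policy excerpts by simple keyword overlap and builds a prompt that restricts the answer to those excerpts. The policy snippets are invented placeholders, the helper names are hypothetical, and the step that would send the prompt to Claude is deliberately left out.

```python
import re

# Invented placeholder snippets; these are NOT Anthropic's actual policy.
POLICY_SNIPPETS = {
    "invoices": "Placeholder: how to submit an invoice for payment.",
    "purchase-orders": "Placeholder: when a purchase order (PO) is required.",
    "submission": "Placeholder: where to submit a purchase request.",
}

STOPWORDS = {"how", "do", "i", "a", "an", "the", "to", "for",
             "is", "get", "where", "when", "this"}


def tokenize(text: str) -> set:
    """Lowercase the text and keep non-stopword alphanumeric tokens."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}


def retrieve(question: str, k: int = 2) -> list:
    """Return up to k (name, body) snippets sharing a keyword with the question."""
    q = tokenize(question)
    scored = [(len(q & tokenize(body)), name, body)
              for name, body in POLICY_SNIPPETS.items()]
    hits = sorted((s for s in scored if s[0] > 0), reverse=True)
    return [(name, body) for _, name, body in hits[:k]]


def build_prompt(question: str):
    """Build a grounded prompt, or return None when no policy text matches."""
    hits = retrieve(question)
    if not hits:
        return None  # nothing to ground on: hand the question to a human
    excerpts = "\n".join(f"[{name}] {body}" for name, body in hits)
    return ("Answer using only the policy excerpts below and cite the excerpt name.\n"
            f"Policy excerpts:\n{excerpts}\n\nQuestion: {question}")
```

Returning `None` when nothing matches is the part that matters most: a bot that declines and routes to a person is easier to trust than one that answers without a source. A production system would likely replace the keyword overlap with a proper search index.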

What makes this significant is not the sophistication of the technology. It's the recognition that the highest-leverage AI deployments are often the most unglamorous ones. Eliminating repetitive query load is not a moonshot. It is just good operations. And it compounds: every hour reclaimed from answering the same question is an hour reinvested in supplier strategy, risk analysis, or contract negotiation.

Four Essentials for Building

[Figure: Katie Streu's four building principles: start simple, fix the process, keep humans in the loop, secure stakeholder buy-in.]

Streu closed the technical section with four principles distilled from the team's experience. They are worth quoting directly, because each one pushes back against a common enterprise AI failure mode:

01. Start simple. If you're not ready to make it complex, don't. Iterate in the open. The temptation to launch with a fully integrated, multi-agent orchestration layer is real and usually counterproductive. Ship something narrow and useful, then earn the right to expand scope.

02. You can't drop an agent on a broken process. Garbage in, garbage out: start with the process, not the prompt. This is the most frequently ignored principle in enterprise AI deployment. A well-prompted model cannot compensate for an ambiguous workflow, unclear ownership, or inconsistent input data.

03. Human in the loop is non-negotiable. The agent's job is to flag what needs attention. The reviewer still reviews. This is not a hedge; it is a design choice that makes the system auditable, correctable, and trusted. It also creates the feedback loop necessary for the agent to improve over time.

04. Stakeholder buy-in is key. Unspoken in the bullet, but clear in context: AI deployments that bypass finance controllers, legal, and operational stakeholders fail at the change management layer, not the technical one. The process redesign is as much organisational as it is computational.

The AI-Native Procurement Vision

In her closing remarks, Streu outlined what she called the AI-native procurement vision, defined by three shifts:

[Figure: Three cards defining AI-native procurement (Adoption, Automation, and Assurance), each with a sketch icon and a one-line description. The vision is not about replacing the function; it is about embedding intelligence across every layer of it.]

Adoption: Every team member uses Claude daily. Not a power-user cohort. Not a pilot group. Everyone.

Automation: Claude becomes the solution, not always the identifier of the problem. The function moves from reactive troubleshooting to proactive orchestration.

Assurance: The output is audit-ready, compliant, and verifiable. AI-native does not mean less governed. It means governance is embedded in the workflow rather than bolted on afterward.
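The session did not say how audit-readiness is achieved in practice. One common way to make a record of agent flags and human approvals verifiable is a hash-chained log, where each entry's hash covers the previous entry's hash. The sketch below is an assumption offered for illustration, not a description of Anthropic's system; the function names and event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def _digest(event: dict, prev_hash: str) -> str:
    """Hash the event together with the previous entry's hash."""
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, chaining it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash,
             "hash": _digest(event, prev_hash)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

For example, appending `{"actor": "agent", "action": "flag"}` followed by `{"actor": "human", "action": "approve"}` yields a log for which `verify` returns `True`; changing either stored event afterwards makes it return `False`.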

The closing line of the session landed cleanly: Don't optimise for perfect. Optimise for starting.

What This Means Beyond Procurement

For practitioners working at the intersection of AI strategy and enterprise operations, the Anthropic procurement story is instructive precisely because it is not exceptional. The problems (volume growth, process entropy, repetitive query load, coding errors) are universal. What differs is the discipline with which the team approached sequencing: process clarity before agent deployment, narrow scope before broad orchestration, human approval before autonomous action.

The temptation in most organisations is to start with the most ambitious version of the use case and work backwards. Streu's team did the opposite. They started with the most irritating version of the problem and worked forwards.

That is, arguably, the most transferable lesson in the room.