D&B’s database of 642 million businesses was built for humans, not AI agents. So they rebuilt it.



Dun & Bradstreet has spent more than 180 years building a comprehensive commercial database. Its Commercial Graph, which covers 642 million businesses along with their relationships, corporate hierarchies and risk profiles, was designed for people: credit analysts, risk managers and sales professionals who can wait for query results and work around uncertain entity matches. AI agents can do neither.

As D&B’s customers began pushing agents into credit, procurement and supply chain workflows, the Commercial Graph, which reliably serves nearly 200,000 customers globally, started to strain. Systems built to serve human analysts were the wrong architecture for machines. So D&B rebuilt it.

D&B now treats agents as a new category of consumer alongside its standard credit analysts and sales and marketing specialists, Gary Kotovets, Chief Data and Analytics Officer at Dun & Bradstreet, told VentureBeat. "Now we have to serve the agents of these customers," he said.

What broke when agents started querying

The Commercial Graph was not a single database. It was a collection of separate systems held together by custom integrations, built for different use cases and different markets. Human analysts coped with this fragmentation through SQL queries or pre-built interfaces. Agents could not.

The scale of the underlying data compounded the problem. According to D&B, the database nearly doubled in five years, growing from more than 300 million business records to more than 642 million, with 11,000 fields per record. As records move through its systems, the company now quality-checks about 100 billion pieces of data per month. Running the low-latency queries that agents require against the fragmented architecture was not feasible.

The relationships the graph tracked were also the wrong kind. Older systems recorded static relationships between entities. A CEO was associated with a company. That was the link. Agents working on credit evaluations or third-party risk need dynamic relationships: when that CEO leaves for a new company, which organization does their track record attach to? When a subsidiary changes ownership, how does the change propagate through the corporate hierarchy? These questions used to require custom analytics work. Agents cannot wait for custom analytics work.
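The difference between a static link and a dynamic relationship can be sketched in a few lines. This is a minimal illustration, not D&B's schema: the `Relationship` fields, the `as_of` helper and the names in the example data are all assumptions made up for this sketch. The idea is only that each edge carries a validity period, so a query can ask who held a role on a given date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    """One edge in the graph, valid for a bounded period (hypothetical schema)."""
    subject: str            # e.g. a person or a subsidiary
    role: str               # e.g. "CEO" or "SUBSIDIARY_OF"
    target: str             # the company on the other end of the edge
    start: date             # first day the relationship held
    end: Optional[date]     # None means the relationship still holds


def as_of(edges: list[Relationship], subject: str, role: str, when: date) -> Optional[str]:
    """Return the target that `subject` held `role` with on the date `when`."""
    for edge in edges:
        if edge.subject != subject or edge.role != role:
            continue
        # Half-open interval: the start day is included, the end day is not.
        if edge.start <= when and (edge.end is None or when < edge.end):
            return edge.target
    return None


# Made-up example data: a CEO who moves from one company to another.
edges = [
    Relationship("Jane Doe", "CEO", "Acme Ltd", date(2019, 1, 1), date(2023, 6, 1)),
    Relationship("Jane Doe", "CEO", "Globex Inc", date(2023, 6, 1), None),
]

print(as_of(edges, "Jane Doe", "CEO", date(2021, 3, 15)))
print(as_of(edges, "Jane Doe", "CEO", date(2024, 1, 1)))
```

A static model would keep only one of the two edges. Keeping both, with dates, is what lets an agent answer "who ran this company when the loan was issued" without custom analytics work.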

The wider problem is not unique to D&B. Kotovets said he’s spoken to hundreds of CDOs and CIOs over the past six months and consistently heard the same limitation: they couldn’t build what they wanted in AI because the databases weren’t standardized, normalized or agent-queryable. D&B had this foundation built over decades to serve human analysts. It still had to be rebuilt for agents.

What they actually built

The rebuild began with consolidation. D&B migrated its fragmented databases to cloud infrastructure, redesigned the underlying schema, and built a data fabric layer that normalizes records across markets while meeting regional compliance requirements. The result is a unified knowledge graph that tracks billions of relationships across 642 million companies, continuously updated and enriched by AI-driven data processing.

On top of this graph, D&B built a structured access layer for agents. Raw SQL access could not meet agents' query volumes and latency requirements. Instead, D&B created a set of tools and capabilities, available through MCP, that package data into the right records for specific queries, with context and routing for agents. A matching and entity resolution engine sits behind each query and ensures that when an agent asks about a company, the answer resolves to a verified, specific entity rather than a loose name match.
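To make the distinction between a name match and a resolved entity concrete, here is a minimal sketch. It is not D&B's matching engine, whose internals the article does not describe. The `Entity` fields, the identifier format, the suffix list and the `resolve` function are assumptions for illustration. The point it shows is that a resolver should return exactly one entity or refuse, never guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str   # stable identifier (hypothetical format)
    name: str
    country: str


class AmbiguousMatch(Exception):
    """Raised when a query fits more than one entity."""


def normalize(name: str) -> str:
    """Lower-case a name and drop punctuation and a few common legal suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    suffixes = {"inc", "ltd", "llc", "corp", "gmbh"}
    return " ".join(word for word in cleaned.split() if word not in suffixes)


def resolve(records: list[Entity], name: str, country: Optional[str] = None) -> Optional[Entity]:
    """Return the single entity matching the query, or None if nothing matches.

    Raises AmbiguousMatch rather than guessing when several entities fit.
    """
    wanted = normalize(name)
    hits = [
        r for r in records
        if normalize(r.name) == wanted and (country is None or r.country == country)
    ]
    if len(hits) > 1:
        raise AmbiguousMatch(f"{len(hits)} entities match {name!r}; add more attributes")
    return hits[0] if hits else None


# Made-up records: two different companies share the name "Acme".
records = [
    Entity("E-1", "Acme Ltd.", "GB"),
    Entity("E-2", "ACME Inc", "US"),
    Entity("E-3", "Globex Corp", "US"),
]

print(resolve(records, "Acme", country="US"))
```

Calling `resolve(records, "acme")` without a country raises `AmbiguousMatch`, which is the behavior an agent needs: a human analyst can eyeball two candidates and pick one, but an agent should be forced to supply more attributes.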

D&B tackled agent identity from both directions

Rebuilding the graph and adding the MCP access layer solved the data retrieval problem. It did not solve the identity problem. Agents are not people, and the authentication model built for human users does not apply to machines.

D&B has developed a new registration model for agents. Each agent must map to an authenticated IP address and register an individual access key, which is then treated as an authenticated identity in the same pipeline as a human user.

"Similar to Know Your Customer, we have a Know Your Agent concept that performs these additional checks," Kotovets said.
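A check along these lines might look like the sketch below. The article only states that an agent maps to an authenticated IP address and an individual access key; the registry layout, the field names, the `authorize` function and the entitlement model are assumptions added for illustration, and a real system would store hashed keys rather than plaintext.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRegistration:
    company: str               # the customer the agent acts for
    allowed_ip: str            # the IP address registered for this agent
    access_key: str            # the agent's individual access key (plaintext only for this sketch)
    entitlements: frozenset    # data products this agent may request


# Hypothetical registry keyed by agent ID; the IP is from a documentation-only range.
REGISTRY = {
    "agent-42": AgentRegistration(
        company="Example Bank",
        allowed_ip="203.0.113.10",
        access_key="s3cr3t-key",
        entitlements=frozenset({"credit_report"}),
    ),
}


def authorize(agent_id: str, source_ip: str, access_key: str, product: str) -> bool:
    """Allow a request only if the agent, its IP, its key and its entitlement all check out."""
    reg = REGISTRY.get(agent_id)
    if reg is None:
        return False
    # compare_digest avoids leaking key contents through timing differences.
    key_ok = hmac.compare_digest(reg.access_key.encode(), access_key.encode())
    return key_ok and source_ip == reg.allowed_ip and product in reg.entitlements


print(authorize("agent-42", "203.0.113.10", "s3cr3t-key", "credit_report"))
print(authorize("agent-42", "198.51.100.7", "s3cr3t-key", "credit_report"))
```

The registration record ties each request back to a company and a set of entitlements, which is the same information a human login would carry.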

This solves the inbound problem: knowing which company an agent belongs to and what information it is entitled to request. But D&B also built for the outbound problem: what happens when a customer's multi-agent workflow loses track of which company it is analyzing.

In a workflow that combines a credit check agent, a KYC agent and a third-party risk agent, each queries D&B at a different step. Without a mechanism to confirm that they all refer to the same entity, the workflow may run to completion while operating on different records.

"They still need to go back to our validation agent to make sure they’re talking about the same entity as each other," Kotovets said. "In a way, it’s almost like a digital handshake."

D&B’s business validation agent can be embedded into any workflow as a continuous reference point. It is available over Google’s A2A protocol, regardless of which orchestration tool the customer uses.
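The "digital handshake" can be pictured as a check that every step in a workflow reports the same entity identifier. The sketch below is a local stand-in under stated assumptions: the `StepResult` type, the identifier format and `confirm_same_entity` are hypothetical, and in D&B's setup the comparison would be made by calling its validation agent rather than by comparing values in process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepResult:
    agent: str        # which agent produced this step
    entity_id: str    # the entity identifier that agent worked on


class EntityMismatch(Exception):
    """Raised when workflow steps refer to different entities."""


def confirm_same_entity(steps: list[StepResult]) -> str:
    """Return the shared entity ID, or raise if the steps disagree."""
    if not steps:
        raise ValueError("no steps to validate")
    ids = {step.entity_id for step in steps}
    if len(ids) > 1:
        detail = ", ".join(f"{s.agent}={s.entity_id}" for s in steps)
        raise EntityMismatch(f"steps refer to different entities: {detail}")
    return steps[0].entity_id


# Made-up workflow: three agents that should all be looking at entity E-2.
workflow = [
    StepResult("credit_check", "E-2"),
    StepResult("kyc", "E-2"),
    StepResult("third_party_risk", "E-2"),
]

print(confirm_same_entity(workflow))
```

If the KYC step had resolved to `E-1` instead, the function would raise `EntityMismatch` before any decision is made, rather than letting the workflow finish on mixed records.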

Four things enterprises should get right before deploying AI agents

The rebuild surfaced requirements that apply beyond D&B’s own stack.

  1. Data foundations come before agent infrastructure. The CDOs and CIOs Kotovets has spoken to over the past six months have consistently hit the same wall: they can’t build what they want in AI until their data is clean, normalized and consolidated. D&B already had this foundation. Most businesses don’t, and they will feel it.

  2. Design for dynamic relationships, not static ones. Enterprise information systems typically record point-in-time relationships: a person belongs to a company, an asset belongs to a subsidiary. Agents working on credit, risk or supply chain decisions must reason about relationships that change over time. If the master data captures only a static snapshot, that is all the agent will see.

  3. Build entity consistency checks into multi-agent workflows. When multiple agents touch the same entity at different steps, there is no guarantee that they are all still referring to the same record by the time the workflow completes. That gap has to be designed for explicitly. Entity validation is a design requirement of the workflow, not an optional safeguard.

  4. Build in lineage from the beginning, not later. Each response an agent produces must carry a traceable path back to its source. The cost of error in credit, risk and supply chain decisions is concrete. Lineage should be established before scaling, not added after problems surface.

"You can always click to see where it came from and trace it back to the original source," Kotovets said. "This has been instrumental in opening up many other opportunities for us because we have confidence in what we do."
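One way to picture a traceable path back to the source is to have every value carry its lineage with it, so that derived answers inherit the sources of their inputs. The sketch below is an illustration only: the `SourceRef` and `Answer` types, the `derive` helper, the system names and the figures are all made up, and the article does not describe how D&B implements this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    system: str       # where the value came from (hypothetical system name)
    record_id: str    # identifier of the record inside that system


@dataclass(frozen=True)
class Answer:
    value: object
    lineage: tuple = ()   # ordered SourceRef entries, original sources first


def derive(value, *inputs: Answer, step: SourceRef) -> Answer:
    """Build a new answer whose lineage is the inputs' lineage plus this step."""
    inherited = tuple(ref for ans in inputs for ref in ans.lineage)
    return Answer(value=value, lineage=inherited + (step,))


# Made-up figures with made-up sources.
revenue = Answer(1_000_000, (SourceRef("registry_filing", "F-100"),))
debt = Answer(250_000, (SourceRef("bank_feed", "B-7"),))

# A derived ratio keeps the trail of everything it was computed from.
ratio = derive(debt.value / revenue.value, revenue, debt,
               step=SourceRef("ratio_calc", "run-1"))

for ref in ratio.lineage:
    print(ref.system, ref.record_id)
```

Because lineage travels with the value, it cannot be forgotten at a later step, which is the practical argument for building it in before scaling rather than retrofitting it.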


