Merck and Mastercard are seeing real agentic AI results. Both say plumbing came first.



Merck uses AI agents to cut drug discovery cycles by a third and deliver relevant marketing materials up to 80% faster, but VP of Digital Platforms Sean Finnerty says it works only because the company built the infrastructure first.

The pharmaceutical manufacturer is seeing promising early results: AI is creating marketing plans that are “99% accurate” when it comes to compliance, reducing review cycles from months to days, and speeding up delivery by 70% to 80%. In the company’s medical research, meanwhile, an AI-assisted discovery cycle was reduced by 33%.

Still, agentic AI only works if companies build the basic “plumbing” first, Finnerty said at a recent AI Impact Series event.

“If we do it all at once, we’ll end up with thousands and thousands of things that will end up with debt that we’ll only deal with later,” he said. “And that would be a drag on any innovation.”

Starting with plumbing

Merck’s plumbing-first strategy stems from lessons learned in the early days of the cloud in the 2010s, when “nobody knew what was going on,” Finnerty said.

Getting the cloud right meant building from the ground up. At Merck, that infrastructure now supports 2,500 AWS accounts, multiple Microsoft Azure subscriptions, and new Google Cloud Platform (GCP) integrations.

“AI is going to be the same thing,” Finnerty said. “We’ll have thousands and thousands of agents.” Then the questions arise: How do you register them? How do you provision them? How do you ensure they are connected to the right tools and have access to the right information in the right context?
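One way to picture the registration question is a central catalog that records who owns each agent and what it may touch. The sketch below is a minimal illustration only, not Merck's system: the class names, field names, and the example agent are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentRecord:
    """One registry entry. All field names are illustrative assumptions."""
    name: str
    owner: str
    allowed_tools: frozenset[str] = field(default_factory=frozenset)
    data_scopes: frozenset[str] = field(default_factory=frozenset)


class AgentRegistry:
    """In-memory stand-in for a central agent catalog."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        # Refuse duplicates so every agent has exactly one accountable entry.
        if record.name in self._agents:
            raise ValueError(f"agent already registered: {record.name}")
        self._agents[record.name] = record

    def can_use(self, agent: str, tool: str, scope: str) -> bool:
        # Unknown agents get no access by default.
        record = self._agents.get(agent)
        if record is None:
            return False
        return tool in record.allowed_tools and scope in record.data_scopes


# Hypothetical agent, tool, and data scope names.
registry = AgentRegistry()
registry.register(
    AgentRecord(
        name="marketing-review-agent",
        owner="digital-platforms",
        allowed_tools=frozenset({"document_search"}),
        data_scopes=frozenset({"marketing_materials"}),
    )
)
```

With a catalog like this, a request from an unregistered agent, or for a tool outside an agent's allow-list, is denied before it reaches any data.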

Context delivery is also important. Merck operates across three hyperscalers and has forty-seven edge locations and hundreds of databases. “Many, many petabytes” of structured and unstructured data are stored in Oracle databases, SQL databases, Excel spreadsheets, phone transcripts and other repositories, Finnerty said.

His team builds scaffolding to provide meaningful context in different situations, he explained. Data must be organized and entered into various platforms because “there is no one solution to every problem.” Sometimes it’s Databricks, other times it’s Amazon Redshift, “plus four other things.”

The goal is: “Let’s make it easy and frictionless for people to do and make it secure and make sure it integrates well with MCP (Model Context Protocol) and A2A (Agent2Agent) and upstream computing,” Finnerty said. “If you want to run things on GCP or run on AWS, we have the plumbing so you can run your neighboring workloads wherever you want.”

How Merck uses agents

In building its technical plumbing, Merck is experimenting with agents to modernize regulated enterprise operations, scientific discovery workflows, and applications.

Notably, AI is accelerating drug discovery. Scientists look at molecular structures and disease states to determine whether a particular condition can be treated with a medicine, Finnerty explained. But even if a disease state is known, it can take years to develop a drug to target it.

Now, with AI, teams are starting to see “very promising things,” like cutting a given research cycle by a third. “It’s a year into the life of the discovery period,” Finnerty said. “This means that, in theory, we could deliver this treatment to a patient who needs it a year sooner.”

Once developed and approved, these products are regulated and the marketing materials around them must be clearly and unambiguously stated. “Your delivery of that information by market, by country, by state, by region, is very carefully managed and regulated,” Finnerty said. It’s also variable: the vaccine ad campaign in Georgia looks very different from the one launched in Canada.

Historically, humans have done the due diligence to make sure a company is in compliance with various laws. Draft materials undergo repeated reviews; when an error is detected, it “is brought back to the beginning and it goes through again and then it takes more weeks and months,” Finnerty said.

But now AI can do it “much, much more efficiently,” and the process is increasingly moving beyond a human in the loop toward a “human as governor” model. With human supervision, AI can deliver a first draft that is 99% accurate within a day or a week, allowing teams to ship materials up to 80% faster.

Meanwhile, when it comes to application modernization, AI can map an application’s architecture, data interactions, APIs, network paths, authentication checks and authorization; it can also write Terraform code for deployment and refactor JavaScript into Python.

Where the company previously spent weeks, months and hundreds of thousands of dollars updating an app, agents now manage the work through tasks, Finnerty said.

Getting past the “craziness”

This is not to say there are no significant challenges. Finnerty noted that his team has encountered some “craziness,” for example in automated code and scenario testing. If the AI has the wrong context or infrastructure, or is just being creative, it will invent scenarios along the lines of “you should test these three features that don’t even exist in the code you’re trying to test.”

“It surprised me a little bit because I thought we’d gotten past some of the hallucinations in these later models,” he said.

To address this, his team designed guardrails to minimize hallucinations, using AI to check AI and applying confidence scores. For example, if Claude produced the first output, they have Microsoft Copilot evaluate it.

“So if you ask something once, check it with the AI, then ask it a third time, the confidence increases each time, and it minimizes some of the garbage that occurs in the first tests,” Finnerty said.
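The cross-checking idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Merck's guardrail: the reviewers are plain functions standing in for calls to different models, the averaging rule and the 0.8 threshold are arbitrary choices, and `KNOWN_FUNCTIONS` is a hypothetical list of functions that actually exist in the code under test.

```python
from typing import Callable, Sequence

# A reviewer takes a draft and returns a score in [0, 1]. In practice each
# reviewer would wrap a different model; here they are ordinary functions.
Reviewer = Callable[[str], float]


def cross_check(
    draft: str, reviewers: Sequence[Reviewer], threshold: float = 0.8
) -> tuple[float, bool]:
    """Average independent reviewer scores and accept only above the threshold."""
    if not reviewers:
        raise ValueError("at least one reviewer is required")
    # Clamp each score so a misbehaving reviewer cannot skew the average.
    scores = [min(1.0, max(0.0, review(draft))) for review in reviewers]
    confidence = sum(scores) / len(scores)
    return confidence, confidence >= threshold


# Hypothetical: the functions that really exist in the code being tested.
KNOWN_FUNCTIONS = {"parse_order", "apply_discount"}


def references_only_known_functions(draft: str) -> float:
    # Score the share of named functions that exist in the code under test.
    named = {word.strip("(),.") for word in draft.split() if "_" in word}
    if not named:
        return 0.0
    return len(named & KNOWN_FUNCTIONS) / len(named)


def is_nonempty(draft: str) -> float:
    # Trivial second reviewer: any non-blank draft passes.
    return 1.0 if draft.strip() else 0.0


reviewers = [references_only_known_functions, is_nonempty]
good = cross_check("Test parse_order and apply_discount.", reviewers)
bad = cross_check("Test parse_order, send_invoice and refund_order.", reviewers)
```

Here the second draft names two functions that do not exist, so its averaged score falls below the threshold and it would be sent back rather than accepted.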

Use cases for agentic AI in financial services

Meanwhile, Andrew Reiskind, Chief Data Officer at Mastercard, and his team are focusing their agent experiments on highly orchestrated transaction and dispute workflows. As he points out, a chargeback or fraud dispute is not an isolated incident.

When a consumer disputes a payment (usually online), it “starts another process that’s very labor-intensive on the back end,” Reiskind said.

Mastercard must collect specific information regarding the actual dispute; then the merchant conducts its own investigation (Has the card been reported lost or stolen? Does the consumer often dispute charges?). In addition, the network sitting in the middle has its own rules for timing and data submission.

“You have each of these steps, many of which are unstructured, but also have structured data elements associated with it,” Reiskind said. A lost or stolen card is usually structured, but a consumer complaint is “unstructured data of questionable reliability.”

“So you’re sitting there with a decision-making system that has deterministic decisions but also probabilistic decisions,” he said.

This process can be accelerated and potentially resolved by AI agents, but designing it is complex: What tasks do you give the agents? When do they return things to human representatives? How many agents do you end up using? What are the cost implications?

Then there are the reputational questions and costs: Did you just potentially call the consumer a liar when they weren’t?

“It’s a clear challenge that as a bank you want to maintain trust with the consumer,” Reiskind said. “But you also want to do it efficiently and take costs out of the system.”
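The mix of deterministic and probabilistic decisions Reiskind describes, along with the question of when to hand a case back to a person, can be sketched roughly as follows. This is an illustrative assumption rather than Mastercard's logic: the rule, the `complaint_credibility` score, and both thresholds are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    DENY = "deny"
    HUMAN_REVIEW = "human_review"


@dataclass
class Dispute:
    card_reported_stolen: bool    # structured signal
    complaint_credibility: float  # hypothetical model score in [0, 1]


def triage(
    dispute: Dispute, approve_above: float = 0.9, deny_below: float = 0.1
) -> Decision:
    # Deterministic step: a structured fact settles the case outright.
    if dispute.card_reported_stolen:
        return Decision.APPROVE
    # Probabilistic step: act automatically only when the score is decisive.
    if dispute.complaint_credibility >= approve_above:
        return Decision.APPROVE
    if dispute.complaint_credibility <= deny_below:
        return Decision.DENY
    # Everything in between goes back to a human representative.
    return Decision.HUMAN_REVIEW
```

Widening the gap between the two thresholds sends more cases to people, which costs more but lowers the chance of wrongly denying an honest customer.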

PB&J versus turkey: Determining what risks are acceptable

Reiskind said there will always be risk associated with artificial intelligence, and businesses should assess this from the beginning of product design. There is also the issue of acceptable risk.

For example: Have you given a customer a peanut butter and jelly sandwich instead of a turkey sandwich (a minor inconvenience)? Or have you given gluten to someone with celiac disease?

“If one percent is wrong, is that an acceptable risk? If so, let’s move on to the next step of how you mitigate that risk,” Reiskind said.

Leaders must conduct a cost-benefit analysis, break problems down into “components” and calculate the costs for each. But these are estimates; real usage is nearly impossible to predict, Reiskind said. “Achieving costs is not an easy process,” he said. “But it’s possible.”
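A rough way to work through the per-component arithmetic is to add each component's running cost to its error rate multiplied by the cost of one mistake. The sketch below uses entirely made-up figures and hypothetical names; it only illustrates why the same 1% error rate can be acceptable in one case and not in another.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    run_cost: float        # estimated cost to run one task
    error_rate: float      # estimated fraction of tasks that go wrong
    cost_per_error: float  # estimated cost of a single mistake


def expected_cost(component: Component) -> float:
    """Expected cost per task: running cost plus probability-weighted error cost."""
    return component.run_cost + component.error_rate * component.cost_per_error


# All numbers below are invented for illustration.
wrong_sandwich = Component("wrong sandwich", 0.10, 0.01, 5.0)
allergen_exposure = Component("allergen exposure", 0.10, 0.01, 50_000.0)

for component in (wrong_sandwich, allergen_exposure):
    print(f"{component.name}: {expected_cost(component):.2f} per task")
```

With these assumed inputs, the minor mix-up adds a few cents per task while the severe error dominates the total, which is the point at which mitigation has to be designed in.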


