
Enterprise AI is entering a new phase, one where the key question is no longer what can be built, but how organizations can get the most out of what they have already bought.
In a recent VentureBeat AI Impact Tour session, Brian Gracely, director of portfolio strategy at Red Hat, described the operational reality at large organizations: proliferating AI deployments, rising costs, and limited visibility into what those investments are actually returning.
This is the "Day 2" moment, when pilots move into production and cost, management, and sustainability become harder problems than building the system in the first place.
"We've seen customers say, 'I have 50,000 Copilot licenses. I really don't know what people are getting out of them. But I know I'm paying for the most expensive compute in the world, because it's GPUs,'" Gracely said. "'How am I going to manage this?'"
Why enterprise AI spending is now a boardroom issue
For much of the past two years, cost was not the main issue for organizations evaluating generative AI. The experimental phase allowed teams to spend freely, and the promise of productivity gains justified aggressive investment. That dynamic is changing as enterprises enter their second and third budget cycles with AI. The focus has shifted from "can we build something?" to "are we getting what we pay for?"
Enterprises that made big, early bets on managed AI services are now scrutinizing whether those investments are delivering measurable value. The problem is not just that GPU compute is expensive; it is that many organizations lack the tools to link costs to outcomes, making it nearly impossible to justify upgrades or scale responsibly.
Strategic shift from token consumer to token producer
The dominant AI procurement model of the past few years has been simple: pay a vendor per token, seat, or API call, and let someone else manage the infrastructure. This model made sense as a starting point, but it is increasingly being questioned by organizations with enough experience to compare alternatives.
Enterprises further along in their AI journey are starting to rethink this model.
"Instead of just being a token consumer, how can I start being a token producer?" Gracely said. "Are there use cases and workloads where it makes sense for me to own more? That could mean running my own GPUs. It could mean renting GPUs. And then it becomes, 'Does this workload need a state-of-the-art model? Are there capable open models, or smaller models, that would fit?'"
The decision is not binary. The right answer depends on the workload, the organization, and the risk tolerance involved. But as the number of capable open models grows, from DeepSeek to models available in cloud marketplaces, the math gets more complicated. Enterprises now have real alternatives to the handful of providers that dominated the landscape two years ago.
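The consumer-versus-producer question ultimately reduces to a cost comparison. A rough break-even sketch, in which every price and throughput figure is a made-up placeholder rather than a real vendor quote, might look like this:

```python
# Hypothetical break-even sketch for "token consumer vs. token producer".
# All constants below are illustrative assumptions, not real prices.

API_PRICE_PER_M_TOKENS = 10.0    # $ per million tokens from a managed API (assumed)
GPU_HOURLY_COST = 4.0            # $ per GPU-hour, rented or amortized (assumed)
TOKENS_PER_GPU_HOUR = 2_000_000  # throughput of a self-hosted open model (assumed)

def consumer_cost(tokens: float) -> float:
    """Cost of buying tokens from a managed API."""
    return tokens / 1e6 * API_PRICE_PER_M_TOKENS

def producer_cost(tokens: float) -> float:
    """Cost of generating the same tokens on your own (or rented) GPUs.
    Note: ignores ops, staffing, and utilization overhead, which can dominate."""
    return tokens / TOKENS_PER_GPU_HOUR * GPU_HOURLY_COST

monthly_tokens = 5_000_000_000  # 5B tokens per month
print(consumer_cost(monthly_tokens))  # 50000.0
print(producer_cost(monthly_tokens))  # 10000.0
```

The point of such a model is not the specific numbers but the structure: at low volumes the managed API wins, and at high, steady volumes owning or renting capacity can win, provided the operational overhead left out of this sketch is accounted for.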
Falling costs and growing use of AI create a paradox for enterprise budgets
Some business leaders point to Anthropic CEO Dario Amodei's observation that AI-related costs are falling by about 60% a year, arguing that locking in infrastructure investments now could mean significantly overpaying in the long run.
The emergence of open source models such as DeepSeek over the past few years has also significantly expanded the strategic options available to enterprises considering investments in core infrastructure.
But while costs per token are falling, usage is accelerating even faster. This is a version of the Jevons paradox, the economic principle that improvements in resource efficiency tend to increase rather than decrease total consumption, because lower costs enable wider application.
For enterprise budget planners, this means that falling unit costs do not translate into falling total bills. An organization that triples its AI usage while halving its unit costs still spends more than before. The pragmatic response is to ask which workloads truly require the most capable, most expensive models, and which can be handled well by smaller, cheaper alternatives.
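The budget arithmetic above can be made concrete with a one-line model; the figures are illustrative placeholders, not real spend data:

```python
# Minimal sketch of the Jevons-paradox budget math: cheaper units, bigger bill.
# All numbers are hypothetical, for illustration only.

def total_spend(unit_cost: float, usage: float) -> float:
    """Total bill = cost per unit of work * units of work consumed."""
    return unit_cost * usage

before = total_spend(unit_cost=1.00, usage=1_000_000)  # baseline period
after = total_spend(unit_cost=0.50, usage=3_000_000)   # cost halves, usage triples

print(after / before)  # 1.5 -> the bill grows 50% despite units being half price
```

Halving unit cost while tripling usage multiplies total spend by 3 × 0.5 = 1.5, which is why per-token price drops alone do not shrink the AI line item.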
The business case for investing in AI infrastructure agility
The prescription is not to slow down AI investment, but to build with agility in mind. The organizations that will win are not necessarily those that move the fastest or spend the most; they are the ones who build infrastructure and operating models capable of absorbing the next unexpected development.
"The more abstractions you can build in, and the more flexibility you give yourself, the more you can experiment without spending too much money and putting your business at risk. That's just as important as asking whether you're doing things in the best way right now," Gracely explained.
But for all the prominence of AI in enterprise planning cycles, most organizations' hands-on experience is still measured in years, not decades.
"It feels like we've been doing this forever, but we've been doing this for three years," Gracely added. "It's early, and it's moving very fast. You don't know what will happen next, but you have to get a feel for what's coming."
For enterprise leaders still calibrating their AI investment strategies, that may be the most useful takeaway: the goal is not to optimize for today's cost structure, but to build the organizational and technical flexibility to adapt when it changes again.
