Databricks says it solves a decades-old data pipeline problem that slows down AI agents.



For decades, data professionals have struggled to manage operational and analytical databases in a unified way that does not add latency or degrade performance.

Agents have reframed the problem. A system that continuously reasons and acts on live data cannot tolerate a pipeline between itself and the data it must act on.

At the Data + AI Summit on Tuesday, Databricks announced two products aimed at collapsing this infrastructure. Lakehouse//RT provides millisecond query latency directly on managed Delta and Iceberg tables, eliminating the real-time serving tier that enterprises maintain alongside their lakehouses. LTAP, short for Lake Transactional/Analytical Processing, stores Postgres-native transactional data in Delta and Iceberg formats from the point of write, removing the ETL pipelines that have connected transactional and analytical systems for decades.

Databricks co-founder Reynold Xin described a simpler data stack as a "sacred butterfly for agents." In a briefing with VentureBeat, he argued that as users code more applications, the agents that reason analytically on top of those applications need the underlying infrastructure to move quickly.

"Agents actually prefer a simpler stack because they can move faster," he said.

LTAP bets on storage-layer unification where HTAP tried engine convergence

Many vendors have tried different approaches to combining analytics and operational data over the decades.

Back in 2014, analyst firm Gartner coined the term HTAP, short for Hybrid Transactional/Analytical Processing, to describe vendors attempting to combine the two types of databases. MemSQL (now known as SingleStore), SAP HANA and Oracle MySQL HeatWave are among the many HTAP vendors on the market.

LTAP is Databricks’ answer to HTAP, using the Lakebase architecture to consolidate data at the storage layer rather than at the engine level. Databricks Lakebase is a serverless, cloud-based PostgreSQL database service that became generally available in February.

"HTAP is more of an industry failure than a success for us," Xin said.

The LTAP approach unifies data at the storage layer instead of the query layer. Lakebase previously stored Postgres data in object storage in Postgres format, which the lakehouse’s analytics engines had to convert before they could use it effectively. With LTAP, transactional data resides directly in Delta or Iceberg format, so transactional and analytical workloads read the same copy. Postgres remains the transaction engine. Spark and Lakehouse remain the analytics engines.

"The point is that you use the best tool for the job at the query engine level, we just make sure that the storage layer has one copy of the data," Xin said.

The central engineering problem is latency. Object storage has response times in the sub-second range, which is too slow for OLTP workloads that require sub-millisecond performance. Lakebase handles this through a caching layer between the Postgres compute instances and object storage. A key design decision is where the columnar conversion occurs: spare CPU capacity in this caching layer performs the row-to-column conversion before the data lands in object storage.
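To make the row-to-column step concrete, here is a minimal Python sketch. It is not Lakebase code: the function names, the sample order rows, and the use of JSON plus zlib as a stand-in for Delta or Iceberg file encoding are all assumptions made for illustration. It pivots buffered rows into columns, checks that the pivot loses nothing, and prints the compressed size of each layout. Columnar layouts generally compress better because similar values sit next to each other, but the ratio printed here depends entirely on the sample data and codec and says nothing about Databricks’ own figures.

```python
import json
import zlib


def rows_to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    if not rows:
        return {}
    names = list(rows[0].keys())
    return {name: [row[name] for row in rows] for name in names}


def columns_to_rows(columns):
    """Inverse pivot, used here only to check that no data was lost."""
    names = list(columns.keys())
    if not names:
        return []
    length = len(columns[names[0]])
    return [{name: columns[name][i] for name in names} for i in range(length)]


def compressed_size(obj):
    """Bytes after JSON-encoding and zlib-compressing obj (a stand-in codec)."""
    return len(zlib.compress(json.dumps(obj).encode("utf-8")))


# Hypothetical order rows, as a transactional engine might buffer them.
rows = [
    {
        "order_id": i,
        "status": "shipped" if i % 3 else "pending",
        "region": "eu-west",
        "amount_cents": 1000 + (i % 50),
    }
    for i in range(10_000)
]

columns = rows_to_columns(rows)
assert columns_to_rows(columns) == rows  # the pivot is lossless

row_bytes = compressed_size(rows)
col_bytes = compressed_size(columns)
print(f"row-oriented: {row_bytes} bytes, column-oriented: {col_bytes} bytes")
```

Doing the pivot before the flush, rather than after the data is already in object storage, means only the smaller columnar payload crosses the network.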

"When you convert data from row to column, it typically compresses more than 10 times, so now you significantly reduce the network cost of that underlying caching layer between that caching layer and the object stores," Xin said.

Lakehouse//RT provides millisecond query latency on live lake data without a separate serving tier

Lakehouse//RT is Databricks’ answer to the dedicated real-time serving tier: the separate systems enterprises have kept alongside their lakehouses to handle low-latency queries, at the cost of data copies, fragmented governance, and pipeline complexity that agents cannot work around. Key features of Lakehouse//RT include:

Reyden compute engine: Built specifically for high-concurrency, low-latency workloads, Reyden queries Delta and Iceberg tables directly without moving data off the lake.

Latency and throughput: Databricks says Lakehouse//RT delivers sub-100ms latency at 12,000 requests per second, response times as low as 10ms on smaller datasets, and 16x better performance than existing dedicated serving stacks.

Governance and access: Every request runs within Unity Catalog’s governance framework, without a separate permissions layer, data copies, or ingestion pipelines.
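The latency and throughput figures above can be put in perspective with Little’s law, which says the average number of requests in flight equals the arrival rate multiplied by the average time each request spends in the system. The short Python sketch below applies it to the quoted numbers. It assumes the request rate is sustained and treats the quoted latency bound as the average, and it assumes (the announcement does not say so) that the 10ms figure could apply at the same request rate. The function name is hypothetical.

```python
def in_flight_requests(requests_per_second, latency_ms):
    """Little's law: average concurrency = arrival rate x average time in system."""
    return requests_per_second * latency_ms / 1000


# Quoted figures: 12,000 requests per second at under 100 ms latency.
# Using 100 ms as the average gives an upper bound on concurrent requests.
print(in_flight_requests(12_000, 100))  # 1200.0

# Assumption: the 10 ms figure at the same rate (not stated in the announcement).
print(in_flight_requests(12_000, 10))  # 120.0
```

Under those assumptions, the engine would be handling on the order of a thousand concurrent requests at most, which is why the engine is described as built for high concurrency as well as low latency.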

Analysts see the agent framing and open-format approach as the real differentiators

The problem both products address is well documented among enterprise data teams, but analysts draw a distinction between the pain point and Databricks’ specific claim.

"Enterprises have had HTAP, streaming, cloud storage and transaction stores for years," Stephanie Walter, AI Stack Practice Lead at HyperFRAME Research, told VentureBeat. "What is different is the agent AI framework."

Agents need live transactional data, historical context, governance, search and writeback in the same workflow, Walter noted.

"That’s a strong architectural argument, but Lakebase still needs to prove it can meet the latency, reliability, and operational maturity that CIOs expect," she said.

Mike Leone, an analyst at Moor Insights and Strategy, said the path to true differentiation is more specific than the convergence concept itself. He also noted that open analytics on the data lake is now table stakes, with many vendors offering some version of it.

"A less common move is to allow transactional writes in open formats as well, so that the transactional database doesn’t sit in a proprietary box while only the analytics half is open," Leone told VentureBeat.

He added that with Lakehouse//RT, an open-format approach that queries live data directly on the lake gives the architecture a strong case for replacing specialized systems altogether.

The technical claim that will draw the most scrutiny is also the most central one. "The piece I still want their engineers to go through is that both engines actually share a copy without a silent conversion step that syncs in the middle," Leone said.

What this means for businesses

For data engineers evaluating stacks for agent workloads, the question is no longer which best-of-breed tool to run for each job. It is whether running separate tools is still defensible.

Enterprises running separate operational databases, real-time serving tiers, and analytics lakehouses may have previously viewed the gaps between them as a maintenance burden. Agents expose these gaps as operational risk: a system that reasons across governance boundaries will find inconsistencies faster than any human team.

The market is moving away from specialized serving layers faster than most vendor roadmaps anticipate. According to VB Pulse Q1 2026, a three-wave longitudinal survey of organizations with more than 100 employees, hybrid search intent tripled during the quarter, from 10.3% to 33.3%, while stand-alone vector database adoption declined for every vendor tracked. The same consolidation logic now reaches the real-time serving tier.

The traditional approach (best-of-breed tools for each workload type, with pipelines between them) was built for human-speed analytical consumption. Agent workloads do not tolerate this architecture.

"The pain they point to, all the replication and synchronization between operational and analytics systems, is real and expensive, and anyone running it at scale feels it," Leone said.


