LangChain’s CEO claims that better models alone won’t get your AI agent into production



As models become more intelligent and capable, the “harnesses” around them should evolve too. This “harness engineering” is an extension of context engineering, LangChain co-founder and CEO Harrison Chase says on the latest episode of the VentureBeat Beyond the Pilot podcast. While traditional AI harnesses tend to restrict models from looping and calling tools, harnesses built specifically for AI agents let them act more independently and carry out long-running tasks effectively.

Chase also weighed in on OpenAI’s acquisition of OpenClaw, arguing that its viral success stemmed from a willingness to “let it break” in ways no big lab can, and questioning whether the acquisition actually moves OpenAI closer to a secure, enterprise-ready version of the product. “The trend in harnesses is really to give the large language model (LLM) itself more control over the context engineering, to decide what it sees and what it doesn’t,” says Chase. “Now the idea of a long-term, more autonomous assistant is real.”

Track progress and maintain consistency

Chase noted that while the concept of letting LLMs run in a loop and call tools seems relatively simple, it is difficult to pull off reliably. For a while, models were “below the limit of usefulness” and simply couldn’t work in a loop, so developers worked around this with graphs and hand-written chains. Chase pointed to AutoGPT, once the fastest-growing GitHub project, as a cautionary example: it had the same architecture as today’s best agents, but the models weren’t yet good enough to run reliably in a loop, so it quickly fizzled out.

But as LLMs continue to improve, teams can build environments where models run in loops and plan over longer horizons, and they can continually improve these harnesses. Previously, “you couldn’t really do improvements on the harness because you couldn’t really run the model on the harness,” Chase said.

LangChain’s answer is Deep Agents, a customizable, general-purpose harness. Built on LangChain and LangGraph, it includes planning capabilities, a virtual file system, context and token management, code execution, and skill and memory functions. It can also delegate tasks to subagents, which are specialized with different tools and configurations and can work in parallel. Context is isolated, so subagent work does not pollute the parent agent’s context, and large subtask contexts are compressed into a single result for efficiency.

Chase explained that these agents all have access to file systems and can create to-do lists that they execute and track over time. “When it goes to the next step and goes from a 200-step process to step two or step three or step four, it has a way to track progress and maintain that consistency,” Chase said.
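The harness pattern Chase describes can be sketched in a few lines: a loop in which the model writes its own plan, works through it one tool-calling step at a time, and persists notes to a virtual file system. This is a minimal illustrative sketch, not LangChain’s Deep Agents API; the `Harness` class and `fake_model` stand-in are hypothetical names invented for this example.

```python
# Minimal sketch of an agent "harness": the model runs in a loop,
# tracks progress in a to-do list, and writes notes to a virtual
# file system. `fake_model` stands in for a real LLM call.
from dataclasses import dataclass, field

@dataclass
class Harness:
    todos: list = field(default_factory=list)   # plan the agent tracks over time
    files: dict = field(default_factory=dict)   # virtual file system for notes
    done: list = field(default_factory=list)

    def run(self, model, task: str, max_steps: int = 10):
        self.todos = model("plan", task)        # model writes its own to-do list
        for _ in range(max_steps):
            if not self.todos:
                break                           # all subtasks finished
            step = self.todos.pop(0)
            result = model("execute", step)     # one tool-calling step
            self.files[f"notes/{step}.txt"] = result  # write ideas down as it goes
            self.done.append(step)
        return self.done

# A stand-in "model" that plans two steps and echoes execution results.
def fake_model(mode, payload):
    if mode == "plan":
        return ["research", "summarize"]
    return f"result of {payload}"

harness = Harness()
completed = harness.run(fake_model, "write a report")
```

Because the plan and the notes live outside the model’s context window, the agent can pick up at step three of a long process without re-deriving the first two steps, which is the consistency Chase is pointing at.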
“It comes down to letting the LLM essentially write down its ideas as it goes along.” He stressed that harnesses should be designed so that models can maintain consistency over longer tasks and can offload context at the points they determine are useful. Giving agents access to code interpreters and bash tools also increases flexibility. And providing agents with skills, as opposed to just preloaded tools, allows them to load information as needed. “So rather than hard-coding everything into one big system prompt,” Chase explained, “you can make a smaller system prompt: ‘This is the basic foundation, but if I need to do X, let me read the skill for X. If I have to do Y, I read the skill for Y.’”
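The skills idea above amounts to lazy-loading instructions instead of front-loading them. Here is a hedged sketch under that assumption: the base prompt stays small, and detailed instructions are pulled in only when a task appears to need them. The skill names, the `SKILLS` dictionary, and the crude relevance check are all illustrative, not any real agent framework’s API; in practice skills would live as files on the agent’s file system.

```python
# Sketch of "skills" vs. one big system prompt: detailed instructions
# are loaded on demand rather than hard-coded into the base prompt.
BASE_PROMPT = "You are a helpful agent. Load a skill when a task needs it."

SKILLS = {  # in practice these would be files the agent reads as needed
    "pdf_report": "Detailed instructions for producing PDF reports...",
    "data_cleanup": "Detailed instructions for cleaning tabular data...",
}

def build_prompt(task: str) -> str:
    """Start from the small base prompt; append only relevant skills."""
    prompt = BASE_PROMPT
    for name, instructions in SKILLS.items():
        if name.split("_")[0] in task:  # crude relevance check for the sketch
            prompt += f"\n\n[skill: {name}]\n{instructions}"
    return prompt
```

For a task like “generate a pdf report”, only the `pdf_report` skill is appended, so the prompt the model sees stays small while the library of skills can grow without bound.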

In fact, context engineering is a “really fancy” way of asking: What does the LLM see? Because it’s different from what developers see, he said. When human developers analyze agent traces, they can put themselves in the AI’s “mindset” and answer questions like: What is the system prompt? How was it created? Is it static or dynamically populated? What tools does the agent have? When it calls a tool and gets a response, how is that response presented? “When agents mess up, they mess up because they don’t have the right context; when they succeed, they succeed because they have the right context,” he said. “I think of context engineering as bringing the right information in the right format at the right time to the LLM.” Listen to the podcast to hear more about:

  • How LangChain built its stack: LangGraph as the backbone, LangChain in the middle, Deep Agents on top.

  • Why code sandboxes will be the next big thing.

  • How a different kind of UX will evolve as agents work longer intervals (or continuously).

  • Why traces and observability are key to creating an agent that actually works.

You can also listen and subscribe to Beyond the Pilot on Spotify, Apple Podcasts, or wherever you get your podcasts.


