A new review paper from researchers at the University of Illinois Urbana-Champaign, Meta, and Stanford argues that code is the foundation on which AI agents reason, act, and coordinate. The paper emphasizes that the real bottleneck for autonomous systems lies in the software layer surrounding the model, referred to as the 'harness.' This layer includes tools, interfaces, sandboxed environments, memory, testing, permission boundaries, execution loops, and feedback channels. Without it, a language model remains stateless. With it, the model becomes a working agent capable of executing tasks over extended periods. The paper outlines how code serves as an executable, testable, and stateful layer between the model and its environment, enabling reliable and continuous operation. It also describes three layers organizing the field: the model's capabilities, the infrastructure, and the code the agent writes on the fly, including test scripts, helper tools, and reusable skills. Commercial systems like Claude Code and OpenAI's Codex already operate on this principle, but the authors warn that current software tests are often incomplete, potentially obscuring risks. They stress the need for more transparent evaluation mechanisms. *Source: [thedecoder](https://the-decoder.com/new-review-paper-argues-code-is-how-ai-agents-think-and-act-not-just-what-they-produce/)*