11. From ad hoc to systematic

Agents and agentic workflows

What agents actually are, what they are not, and why the public discourse gets this wrong.

Morgan Kavanagh · Published 2026-03-28

A model in a loop

An agent is a language model that can take actions and observe their results. Instead of receiving a single prompt and producing a single response, an agent operates in a loop: it reads the current state, decides what to do next, executes an action (calling a tool, searching a database, writing a file, making an API request), observes the result, and decides again. The loop continues until the agent determines the task is complete. Every step in the loop is still just a language model predicting the next token. The difference is that some of those tokens are interpreted as tool calls rather than text output, and the results of those calls are fed back into the context window for the next iteration.
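The loop described above can be sketched in a few lines. This is a minimal illustration, not any framework's API: the model call is stubbed out (a real system would send the context to an LLM), and the action format, `call_model`, and the `search` tool are all assumptions made for the example.

```python
# A stub standing in for a real language-model call. Here it hard-codes a
# two-step run: one tool call, then a final answer once a result is visible.
def call_model(context):
    if "result:" in context:
        return {"type": "final", "text": "done"}
    return {"type": "tool", "name": "search", "args": {"query": "niche regulation"}}

# The agent's tool set: names mapped to callables.
TOOLS = {"search": lambda query: f"3 results for {query!r}"}

def run_agent(task, max_steps=10):
    context = f"task: {task}"
    for _ in range(max_steps):
        action = call_model(context)        # the model predicts the next step
        if action["type"] == "final":       # the model decides the task is complete
            return action["text"]
        result = TOOLS[action["name"]](**action["args"])  # execute the tool call
        context += f"\nresult: {result}"    # feed the observation back into context
    return "stopped: step limit reached"
```

Note that every iteration is still a single model prediction; the loop and the tool dispatch live entirely outside the model.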

Tools define what agents can do

An agent without tools is just a chatbot in a loop. The tools you give an agent define its capabilities: a search tool lets it look things up, a code execution tool lets it run calculations, a file system tool lets it read and write documents, an API tool lets it interact with external services. The agent decides which tool to use based on its instructions and the current state of the task. Crucially, the agent can only use tools you have explicitly provided. It cannot reach outside its tool set. An agent with access to a search tool and a text editor is fundamentally different from an agent with access to a database and an email sender, even if the underlying model is identical.
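The hard boundary of the tool set can be made explicit in code. The sketch below assumes a simple registry pattern (the class name and tool names are illustrative, not a specific framework's schema): the agent can invoke only what the registry contains, and anything else fails loudly.

```python
# A tool registry that bounds an agent's capabilities: the agent can only
# invoke names that were explicitly provided at construction time.
class ToolRegistry:
    def __init__(self, tools):
        self._tools = dict(tools)

    def call(self, name, **kwargs):
        if name not in self._tools:
            # The agent cannot reach outside its tool set.
            raise PermissionError(f"tool {name!r} not provided to this agent")
        return self._tools[name](**kwargs)

# Two agents with the same model but different registries are, in effect,
# different systems.
research_agent_tools = ToolRegistry({
    "search": lambda query: f"results for {query!r}",
    "edit_text": lambda text: text.strip(),
})
# research_agent_tools.call("send_email", to="...")  # raises PermissionError
```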

What the discourse gets wrong

Public conversation about agents is dominated by two distortions. The first is autonomy. Agents are routinely described as autonomous systems that independently decide what to do, plan complex strategies, and execute them without human involvement. In practice, an agent is a model following instructions inside constraints you defined, using tools you provided, operating within an environment you designed. It is autonomous in the same sense that a dishwasher is autonomous: it runs a cycle you configured, using resources you loaded, and stops when the cycle ends. The second distortion is capability. Marketing language treats "agentic" as a synonym for "intelligent" or "superhuman". An agent that can search the web, write code, and send emails sounds powerful. In practice, it is a model that calls three APIs in a loop. The model is still predicting probable text. It still hallucinates. It still follows the path of least resistance shaped by its training. Giving it tools does not give it judgment.

Where agents actually help

Agents are genuinely useful for tasks that require multiple steps where the specific sequence cannot be determined in advance. A repeatable workflow (module 10) has fixed steps: do A, then B, then C. An agentic workflow adapts: do A, examine the result, decide whether B or C is the right next step, execute it, examine again. A research agent that searches for information, reads the results, identifies gaps, searches again with refined queries, and compiles findings is doing something that would be tedious to script as a fixed pipeline because the search path depends on what it finds. The value is in the adaptive loop, not in any single step.
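The contrast between the two shapes of workflow can be sketched as follows. The gap-detection rule and the search result format here are toy assumptions chosen to make the difference visible, not a real research pipeline.

```python
def fixed_pipeline(query, search):
    # Repeatable workflow: do A, then B, then C. The sequence never changes,
    # regardless of what any step returns.
    return [
        search(query),
        search(query + " amendment"),
        search(query + " official text"),
    ]

def adaptive_loop(query, search, max_rounds=5):
    # Agentic workflow: examine each result and let it determine the next query.
    findings = []
    for _ in range(max_rounds):
        result = search(query)
        findings.append(result)
        gap = result.get("gap")   # examine the result for what is still missing
        if gap is None:           # no remaining gap: the task is done
            break
        query = gap               # refine the next search based on what was found
    return findings
```

The fixed pipeline always runs three searches; the adaptive loop runs as many or as few as the results require.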

Agents fail predictably

Agent failures follow patterns. Vague goals produce wandering behaviour: the agent takes actions that are plausible but unproductive because it has no clear criterion for when the task is done. Broad tool access produces risky behaviour: an agent with write access to a production database and a vague instruction to "clean up the data" can cause damage that a human would never permit. Long loops produce drift: after many iterations, the agent's context window fills with previous steps, early instructions lose influence, and the agent's behaviour degrades. These failures are not mysterious; they are the predictable consequence of giving an underspecified task to a system that has no judgment, only prediction. The remedy is specific goals, constrained tools, and human checkpoints.
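The three remedies can be expressed as guardrails wrapped around the loop: an explicit done-criterion against wandering, a step budget against drift, and a human checkpoint against risky actions. The function names and action format below are illustrative assumptions, a sketch rather than a production harness.

```python
def guarded_run(propose_action, is_done, approve, max_steps=20):
    history = []
    for _ in range(max_steps):              # step budget: bounds drift in long loops
        if is_done(history):                # explicit criterion for task completion
            return history
        action = propose_action(history)
        if action.get("consequential") and not approve(action):
            # Human checkpoint: consequential actions need approval first.
            history.append({"action": action, "result": "blocked by reviewer"})
            continue
        history.append({"action": action, "result": "executed"})
    raise RuntimeError("step budget exhausted without meeting the done criterion")
```

Each guardrail maps to one failure mode: `is_done` prevents wandering, `approve` prevents damage from broad access, and `max_steps` prevents unbounded drift.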

Designing agents that work

An effective agent has a narrow, well-defined task. It has access to exactly the tools it needs and no more. Its instructions specify not just what to do but when to stop, when to ask for help, and what to do when it encounters ambiguity. It operates within an environment where its actions are logged and reversible. A human reviews its output before anything consequential happens. Designed this way, an agent is a powerful component in a larger system. Designed poorly, with broad access, vague goals, and no oversight, it is a liability. The difference is entirely in the design, and the designer is you.
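These design principles can be written down as an explicit specification rather than left implicit in prompts. The field names and example values below are illustrative assumptions, not any framework's schema; the point is that every principle above becomes a field you must consciously fill in.

```python
from dataclasses import dataclass

@dataclass
class AgentSpec:
    task: str                    # narrow, well-defined goal
    tools: tuple                 # exactly the tools needed, no more
    stop_condition: str          # when the task counts as done
    escalation: str              # when to stop and ask a human
    log_actions: bool = True     # every action is recorded and auditable
    require_review: bool = True  # a human approves consequential output

summariser = AgentSpec(
    task="Reduce the executive summary to 200 words",
    tools=("read_file", "write_draft"),
    stop_condition="draft is at most 200 words and covers the flagged points",
    escalation="ask the author if a flagged statistic cannot be verified",
)
```

An agent whose spec leaves `stop_condition` or `escalation` blank is, by the argument above, already a liability.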

Examples

Research agent vs. single search

You need to understand the current state of a niche regulation. A single search returns ten results, most of which are outdated or tangential. An agent searches, reads the top results, identifies that the regulation was amended in 2025, searches specifically for the amendment, finds the official text, reads it, and produces a summary with citations to the current version. The value is not in any single search; it is in the adaptive refinement of the query based on what each step reveals.

The wandering agent

You ask an agent to "improve this document". The agent rewrites the introduction, then reformats the headings, then adds a table of contents, then starts researching additional sources, then begins rewriting sections based on those sources. Two hours later, the document is longer, differently structured, and no longer matches your intent. The instruction was too vague. A specific instruction ("reduce the executive summary to 200 words, fix the three grammatical errors I flagged, and update the statistics in section 4 to the 2025 figures") would have produced a useful result in minutes.