Overview

1 What is an AI agent?

This chapter introduces the modern landscape of AI agents and sets the philosophy of the book: understand and build agents from first principles before relying on frameworks. It surveys how agents show up in practice—from personal assistants and customer-facing systems to specialized coding and research tools—and argues that Large Language Models (LLMs) power nearly all of them. The authors position agent building as an exercise in debugging: to fix failures, you must know how the parts work. They also preview key themes that guide the rest of the book: LLMs as the agent’s “brain,” the distinction between workflows and agents, the GAIA benchmark for measuring progress, and the centrality of context engineering.

The core definition of an agent is LLM + tools + loop: the model decides what to do next, invokes external tools (search, code execution, databases), ingests results back into its context, and iterates until it chooses to stop. This autonomy distinguishes agents from plain LLM calls and from traditional, developer-defined workflows. The chapter maps a spectrum from predictable workflows (single calls, chains, routers) to agentic systems that direct their own multi-step processes and can even write new tools. It offers practical guidance on when agents are warranted—tasks with unstructured inputs, high input diversity, and uncertain step counts—while underscoring trade-offs: higher cost, latency, and error propagation. In production, hybrid designs often work best, embedding agents inside workflow stages for controlled flexibility, cost management, and safer failure handling.

To evaluate agent capabilities, the chapter adopts GAIA, a benchmark of multi-step, real-world questions that demand reasoning, retrieval, and calculation—ideal for iterating on agent designs and quantifying improvements. It then broadens prompt engineering into context engineering: the discipline of curating everything the model sees—system instructions, conversation state, tool outputs, and retrieved knowledge—at the right time and granularity. Most real failures come from missing information rather than insufficient model intelligence, and larger contexts can degrade performance, so relevance and focus matter. The chapter outlines five strategies—Generation, Retrieval, Write, Reduce, and Isolate—that will be layered through the book’s implementation roadmap, alongside practical prerequisites (Python, environment setup, API keys, and cost awareness) to equip readers to build, measure, and iteratively improve agents from scratch.

Example of a language model’s generalization capability.
User requests flow through the research agent, which branches into multiple searches and synthesis.
The LLM Agent's decision loop is an iterative process of LLM decision-making and tool use.
Progression of agency levels in LLM applications.
LLMs can only produce accurate, high-quality responses when sufficient information is provided in the context.
Even with large context windows, longer inputs can degrade model performance(Source: https://research.trychroma.com/context-rot).
An overview of the journey through the book

Summary

  • AI agents span a wide spectrum, from personal assistants like ChatGPT and Claude to customer-facing agents and specialized tools like Claude Code and Cursor. All share a common foundation: LLMs as their decision-making core.
  • An LLM agent consists of three elements: the LLM (brain), tools (means of interacting with the external world), and a loop (iterative process until goal completion). The LLM decides which tool to use and when to stop.
  • Workflows are developer-defined execution flows where LLMs perform specific steps. Agents are LLM-directed flows where the model dynamically determines its own process. Production systems often combine both approaches.
  • Use agents when tasks require multiple unpredictable steps, provide sufficient value to justify costs, and allow for error detection. The GAIA benchmark provides ideal practice problems for agent development.
  • Context engineering is the discipline of providing the right information at the right time. Five strategies (Generation, Retrieval, Write, Reduce, Isolate) form the framework for building effective agents throughout this book.

FAQ

What is an AI (LLM) agent?An LLM agent is a program that uses a Large Language Model as its decision-making core, interacts with the external world through tools, and operates in a loop until a goal is achieved. In short: LLM + tools + loop. The model decides which action to take next and when to stop based on the current context.
How do LLMs enable agent behavior if they only predict the next token?LLMs use their generalization and reasoning abilities to choose actions, not just produce text. When paired with tools (search, code execution, APIs) and an iterative loop (reason, act, observe), the model can plan multi-step tasks, gather missing information, and decide when the task is complete.
What types of AI agents are common today?Three broad types: 1) Personal agents (general-purpose assistants that adapt to many tasks). 2) Customer-facing agents (business-aligned assistants that follow policies and handle transactions). 3) Specialized agents (domain tools like coding or deep research agents that often run asynchronously).
How does the agent loop work?The loop repeats: 1) The LLM evaluates the context and decides if a tool is needed. 2) A selected tool is executed. 3) Tool results are added back into the context. 4) The LLM decides to continue or stop. This supports tasks with unpredictable numbers of steps.
How are agents different from simple LLM calls and traditional workflows?Workflows are developer-defined and predictable (single call, chains, routers). Agents are LLM-directed: they choose actions and tools dynamically and iterate until done. Use workflows for structure and reliability; use agents where flexibility and open-ended problem solving are required.
How do I decide if a task needs an LLM at all?Prefer an LLM when: 1) The task involves unstructured data (text, images, audio) needing flexible interpretation. 2) Inputs and requests are diverse and hard to predefine. If the task is deterministic over structured data, traditional code or a narrow model is cheaper and more reliable.
When should I use an agent instead of a single call or workflow?Consider: 1) Task complexity (unknown number of steps or paths). 2) Task value (benefit outweighs extra cost/latency of multi-call loops). 3) Error cost and detectability (can you catch or tolerate mistakes?). Agents trade higher cost/latency for flexibility.
What is the GAIA benchmark and why use it?GAIA is a dataset of questions that require multi-step reasoning, web search, and calculations—ideal for agent evaluation. It offers clear answers for fast feedback, the right difficulty for iterative development, and minimal domain knowledge requirements.
What is context engineering, and why do agents fail without it?Context engineering is the practice of providing the right information, at the right time, in the right form to the LLM (prompts, history, tool results, retrieved docs, etc.). Agents often fail not from lack of intelligence but from missing information in context. Good context design improves accuracy and reliability.
Is a bigger context always better? What strategies help?No. Longer contexts can degrade performance (context rot, lost-in-the-middle). Focus on relevance using five strategies: 1) Generation (plans, reflections). 2) Retrieval (bring in needed info). 3) Write (persist to memory/workspace). 4) Reduce (summarize/filter). 5) Isolate (separate tasks/tools or agents).

pro $24.99 per month

  • access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
  • choose one free eBook per month to keep
  • exclusive 50% discount on all purchases
  • renews monthly, pause or cancel renewal anytime

lite $19.99 per month

  • access to all Manning books, including MEAPs!

team

5, 10 or 20 seats+ for your team - learn more


choose your plan

team

monthly
annual
$49.99
$499.99
only $41.67 per month
  • five seats for your team
  • access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
  • choose another free product every time you renew
  • choose twelve free products per year
  • exclusive 50% discount on all purchases
  • renews monthly, pause or cancel renewal anytime
  • renews annually, pause or cancel renewal anytime
  • Build an AI Agent (From Scratch) ebook for free
choose your plan

team

monthly
annual
$49.99
$499.99
only $41.67 per month
  • five seats for your team
  • access to all Manning books, MEAPs, liveVideos, liveProjects, and audiobooks!
  • choose another free product every time you renew
  • choose twelve free products per year
  • exclusive 50% discount on all purchases
  • renews monthly, pause or cancel renewal anytime
  • renews annually, pause or cancel renewal anytime
  • Build an AI Agent (From Scratch) ebook for free