Autonomy

Agents and agent systems

"Agent" is one of the most overloaded words in AI. This page separates the loop, the framework, the host, and the persistent system.

Core shape

An agent is a model-driven loop.

An agent observes the current situation, decides what to do, takes an action, reads the result, and repeats until the task is complete or it needs help.

The important part is not that it is "smart." The important part is that it can continue across multiple steps instead of answering once.
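The difference between answering once and continuing across steps can be sketched in a few lines of Python. Everything here is a stand-in: `model_answer`, `model_decide`, and `run_tool` are hypothetical hooks for a real model call and tool execution, not any particular API.

```python
def one_shot(task, model_answer):
    # A plain assistant: one model call, one answer, no follow-up.
    return model_answer(task)

def agent(task, model_decide, run_tool, max_steps=10):
    """A minimal agent loop: keep choosing and running actions
    until the model signals it is done, or a step cap is hit."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = model_decide(history)             # decide the next step
        if action["type"] == "finish":
            return action["answer"]                # task complete
        result = run_tool(action)                  # act
        history.append((action["type"], result))   # observe the result
    return "needs help: step limit reached"
```

The loop body is deliberately boring; the point is only that the model is consulted repeatedly with an accumulating history rather than once.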

In practice

The loop is only one part of what people call an agent.

Some tools sell you a runtime, some sell you a host, and some bundle a long-running assistant platform around both.

Current default

Agent runtime

ELI5: The worker keeps looking around the room, picking the next step, and checking whether the job is done.

What it actually is: The observe-decide-act-evaluate loop plus state, retries, memory retrieval, and stop conditions.

Try it: Start with lab 06.

Real tools: ReAct loops, LangChain agents, OpenAI Agents SDK, Semantic Kernel, custom runtimes.
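The "plus state, retries, and stop conditions" part is where a runtime earns its keep. A sketch of that plumbing, with all names and the policy shape invented for illustration:

```python
def run_with_retries(tool, args, max_retries=2):
    """Runtime plumbing around every action: retry transient tool
    failures, then surface the error as an observation instead of
    crashing the whole loop."""
    for attempt in range(max_retries + 1):
        try:
            return {"ok": True, "result": tool(**args)}
        except Exception as exc:
            last_error = exc
    return {"ok": False, "error": str(last_error)}

def should_stop(state, max_steps=20, max_errors=3):
    # Stop conditions: task done, step budget spent, or too many failures.
    return state["done"] or state["steps"] >= max_steps or state["errors"] >= max_errors
```

Real runtimes add persistence and memory retrieval on top, but the shape is the same: wrap every action, and check a stop condition every turn.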

Current default; absorbs other layers

CLI or IDE agent

ELI5: A worker plus the front desk plus the toolbox all bundled into one station.

What it actually is: A user-facing assistant that combines runtime, host experience, tools, approvals, and context handling.

Try it: Follow lab 09 and lab 11.

Real tools: Goose, Aider, OpenCode, Claude Code, Gemini CLI, Copilot CLI.

Useful niche

Persistent agent system

ELI5: The worker never clocks out; it keeps notes, messages people, and comes back to jobs later.

What it actually is: A long-running assistant with memory, scheduling, skills, channels, and durable task state across sessions.

Try it: Explore the persistent-platform stretch goal.

Real tools: Hermes Agent, OpenClaw, long-running team assistants.

Same word, different layer

Agent runtime

The software loop that manages state, chooses tools, handles errors, retrieves memory, and decides when to stop.

Layer 05 in the stack.

Agent framework

A developer toolkit for building agents, workflows, state machines, tools, memory, and multi-agent systems.

Usually layers 05 and 07.

CLI or IDE agent

A user-facing assistant that can read context, edit files, run tools, and interact with the user through a terminal or editor.

Usually layers 05 and 06, plus tools underneath.

Persistent agent system

A long-running assistant with memory, scheduled work, chat gateways, skills, tool execution, and sometimes self-improvement loops.

Often spans layers 04 through 08.

Long-running systems

Where persistent assistant platforms fit

Self-improving persistent agent

This category combines a terminal interface, messaging gateways, skills, memory, scheduling, tool execution, subagents, protocol integration, and multiple execution backends. Hermes Agent is one public example.

Category: persistent agent system with CLI and messaging interfaces.

Local-first assistant gateway

This category centers on a long-running assistant you run on your own devices, with a local gateway, many chat channels, skills, toolsets, multi-agent routing, companion apps, and sandbox options. OpenClaw is one public example.

Category: persistent assistant platform. Specific feature claims should be checked against current project docs because this area changes fast.
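"Durable task state across sessions" is the property that most clearly separates this category from a session-bound agent. A minimal sketch of what that means using SQLite; the table shape and function names are illustrative, not any project's actual schema:

```python
import sqlite3

def open_task_store(path=":memory:"):
    """Durable task state: tasks live in a database, so they
    survive process restarts instead of dying with the session."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS tasks (
        id INTEGER PRIMARY KEY, title TEXT, status TEXT)""")
    return db

def add_task(db, title):
    cur = db.execute(
        "INSERT INTO tasks (title, status) VALUES (?, 'open')", (title,))
    db.commit()
    return cur.lastrowid

def pending(db):
    # On restart, the agent resumes from whatever is still open.
    return [row[0] for row in
            db.execute("SELECT title FROM tasks WHERE status = 'open'")]
```

A session-bound agent keeps the equivalent of this table in memory; a persistent one points `path` at a real file and picks up where it left off.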

Framework position

Where something like LangChain fits

LangChain is mostly a framework layer

LangChain is best understood as a developer framework for wiring model calls, prompts, tools, retrieval, memory-like patterns, and agent loops into applications. That places it mostly around the agent-runtime layer rather than the raw model-access layer.

Primary fit: framework around layers 03 through 05.

LangGraph pushes upward into orchestration

Once the same ecosystem starts expressing explicit graphs, stateful workflows, and longer-running coordination, it begins to live higher in the stack as orchestration as well as runtime.

Primary fit: layers 05 and 07, with governance adjacent through LangSmith-style tooling.

The practical rule: frameworks like LangChain usually sit above model access and below user-facing hosts. They help you build the middle of the stack.
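The "middle of the stack" idea can be sketched in plain Python: a framework composes prompt, model call, and tool dispatch into something a host can embed, without owning the model API below or the user interface above. This is written in the LangChain spirit but is not its real API; every name here is hypothetical.

```python
def make_chain(prompt_template, call_model, tools):
    """A framework-layer composition: wire prompt -> model -> tool
    dispatch into one callable. The model client comes from the
    layer below; the host above decides how to show the result."""
    def chain(user_input):
        prompt = prompt_template.format(input=user_input)
        decision = call_model(prompt)      # model-access layer below us
        tool = tools[decision["tool"]]     # tool registry the framework manages
        return tool(decision["args"])      # returned to the host above us
    return chain
```

Swapping `call_model` for a real client or `tools` for a richer registry changes nothing about the host; that substitutability is what the framework layer sells.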

How the loop grew up

How we got here

  1. Command-line automation

    Developers already had composable tools: shells, pipes, CLIs, scripts, Makefiles, and CI.

  2. Structured developer protocols

    LSP, DAP, test runners, package managers, and static analysis made software systems easier for machines to inspect.

  3. LLM APIs and function calling

    Applications began asking models for structured tool calls instead of plain text only.

  4. Early agent loops

    ReAct-style prompting and AutoGPT-style experiments showed models could chain thought, action, and observation.

  5. AI coding assistants

    Assistants moved into editors and terminals, where they could combine code context, shell commands, patches, and approvals.

  6. Protocol and skill ecosystems

    MCP and skill systems made integrations and procedures more portable across hosts.

  7. Persistent agents

    Long-running systems started combining memory, chat gateways, scheduling, subagents, and autonomous workflows.
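Step 3 in the timeline above hinges on one mechanical idea: the model returns a structured call instead of free text, and the application routes it to a local function. A sketch of that dispatch, with a stand-in tool registry (the JSON shape is illustrative; real function-calling APIs differ in detail):

```python
import json

TOOLS = {
    # Stand-in tool; a real registry would hold real integrations.
    "get_weather": lambda city: f"18C and clear in {city}",
}

def dispatch(tool_call_json):
    """Parse a structured tool call (the kind function-calling APIs
    return as JSON) and route it to the matching local function."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])
```

Everything after this step in the timeline, from coding assistants to MCP, is elaboration on that parse-and-route move.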

Control pressure

More autonomy means more need for controls.

Capability | Why it is useful | What to watch
Memory | Keeps preferences, project history, and repeated procedures available. | Stale, sensitive, or incorrect memories can mislead the agent.
Scheduling | Lets agents run reports, audits, or maintenance without manual prompting. | Unattended actions need strong permissions and notifications.
Subagents | Parallelizes research, testing, review, and specialized work. | Coordination overhead and unclear ownership can create confusion.
Self-improvement | Can turn repeated experience into better skills or memory. | Generated procedures need review before they become trusted defaults.
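Most of the "what to watch" column reduces to one control: a permission check before the agent acts, strictest when nobody is watching. A sketch of such a gate; the allow/ask/deny policy shape is invented for illustration:

```python
def approve(action, policy):
    """Gate an agent action against a simple allow/ask/deny policy.
    Unattended (scheduled) runs get the strictest treatment: anything
    not explicitly allowed is deferred for a human instead of run."""
    rule = policy.get(action["kind"], "ask")
    if rule == "allow":
        return "run"
    if rule == "deny":
        return "block"
    # "ask": run only if a human is present to confirm.
    return "run" if action.get("human_present") else "defer"
```

Note the default: an action kind the policy has never seen falls through to "ask", so new capabilities start supervised rather than autonomous.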

Walkthrough

Watch the loop happen

1. Observe: the agent reads the user request, repo files, prior tool output, or memory.

2. Decide: it chooses whether to inspect more context, edit a file, run a command, or ask for help.

3. Act: it calls a tool, edits code, creates a task, or delegates work.

4. Evaluate: it reads the result and decides whether the task is done or the loop should continue.
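The four phases above map directly onto code. A compact sketch, where `decide` and `act` are hypothetical stand-ins for the model call and the tool layer:

```python
def agent_step(state, decide, act):
    """One turn of the observe-decide-act-evaluate loop."""
    observation = state["observations"][-1]          # 1. observe
    choice = decide(observation)                     # 2. decide
    if choice["kind"] == "finish":
        return {**state, "done": True, "answer": choice["answer"]}
    result = act(choice)                             # 3. act
    state["observations"].append(result)             # 4. evaluate: feed result back
    return state

def run(task, decide, act, max_steps=8):
    state = {"observations": [task], "done": False}
    for _ in range(max_steps):
        state = agent_step(state, decide, act)
        if state["done"]:
            return state["answer"]
    return None  # hit the step cap: time to ask for help
```

The evaluate phase is just "append the result and go around again"; the loop ends either because the model decides it is done or because the runtime's step budget runs out.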