Hands on

Playground experiments

Use this page for extra experiments and variations. For the step-by-step build path, start with the dedicated labs.

Experiment arc

Build one useful capability several ways.

The best way to understand the ecosystem is to pick a small task and move it up the stack. The labs give that path a clear sequence. This playground is where you try alternate tools, compare tradeoffs, and invent your own small experiments.

If an experiment needs a real provider key, read API key security before you start improvising with shell variables or local configs.

Build path

From model access to agent workflow

This is the short version of the lab arc. The full version starts with model access, then adds tooling, structured boundaries, memory, coordination, a host-like CLI, and governance.

1. Wrap model access

Start with an API key, endpoint, or local address, then create a boring command that sends a prompt and returns a visible result.
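A minimal sketch of that command in Python, assuming an OpenAI-compatible chat completions endpoint. The LLM_ENDPOINT, LLM_API_KEY, and LLM_MODEL environment variables and the default model name are placeholders; point them at whatever provider or local server you actually use.

```python
#!/usr/bin/env python3
"""ask: send one prompt, print one answer. Nothing else."""
import json
import os
import sys
import urllib.request

# Assumptions: an OpenAI-compatible endpoint and a key in the environment.
# Swap these for your provider or a local server address.
ENDPOINT = os.environ.get("LLM_ENDPOINT", "https://api.openai.com/v1/chat/completions")
API_KEY = os.environ["LLM_API_KEY"]  # fail loudly if the key is missing
MODEL = os.environ.get("LLM_MODEL", "gpt-4o-mini")  # placeholder model name

def ask(prompt: str) -> str:
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    # Chat-completions response shape: first choice, message content.
    return reply["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask(" ".join(sys.argv[1:]) or sys.stdin.read()))
```

Once the command prints an answer, stop polishing and move on; the repeatable interface is the deliverable, not the script.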

What we learn

Before tools and agents matter, model access needs a repeatable interface.

2. Add a deterministic tool

Create a small command that accepts flags, emits JSON, returns meaningful exit codes, and supports --dry-run.
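One possible shape for such a tool, sketched in Python. The cleanup task itself is invented for illustration; the flags, JSON output, and exit codes are the pattern that matters.

```python
#!/usr/bin/env python3
"""cleanup: delete *.tmp files under a directory (illustrative task)."""
import argparse
import json
import pathlib
import sys

def main() -> int:
    parser = argparse.ArgumentParser(description="Delete *.tmp files.")
    parser.add_argument("root", type=pathlib.Path, help="directory to clean")
    parser.add_argument("--dry-run", action="store_true",
                        help="report what would be deleted, change nothing")
    args = parser.parse_args()

    if not args.root.is_dir():
        # Exit code 2: bad input. Agents can branch on this without parsing prose.
        print(json.dumps({"error": f"not a directory: {args.root}"}))
        return 2

    targets = sorted(args.root.rglob("*.tmp"))
    if not args.dry_run:
        for path in targets:
            path.unlink()

    # Structured result on stdout: machine-readable, diffable, loggable.
    print(json.dumps({
        "dry_run": args.dry_run,
        "deleted": [str(p) for p in targets],
        "count": len(targets),
    }))
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Run it twice with --dry-run and diff the output: identical JSON is the determinism check from the table further down this page.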

What we learn

Agents work better with tools that are explicit, inspectable, and easy to validate.

3. Add a protocol boundary

Expose the same behavior as a typed tool with a schema and a structured result. MCP is the real-world protocol to compare your version against.
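A sketch of the same cleanup behavior exposed over MCP, using the FastMCP helper from the official Python SDK (the mcp package); the type hints become the tool's published schema. The SDK's API has shifted across versions, so treat this as the shape, not the letter.

```python
import pathlib

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("cleanup")

@mcp.tool()
def clean_tmp(root: str, dry_run: bool = True) -> dict:
    """Delete *.tmp files under root; with dry_run, only report."""
    targets = sorted(pathlib.Path(root).rglob("*.tmp"))
    if not dry_run:
        for path in targets:
            path.unlink()
    # Structured result, mirroring the CLI's JSON output.
    return {
        "dry_run": dry_run,
        "deleted": [str(p) for p in targets],
        "count": len(targets),
    }

if __name__ == "__main__":
    mcp.run()  # defaults to stdio transport
```

The payoff of the boundary: the host sees a named, typed capability it can discover and call, instead of a shell string it has to trust.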

What we learn

Protocols make capabilities discoverable and portable across hosts.

4. Add hooks

Run checks before dangerous commands execute, after files are generated, or before outputs are committed.
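Hook contracts vary by host. The sketch below assumes a host that pipes the pending tool call to the hook as JSON on stdin and treats a nonzero exit status as a block; the command field name and the blocklist are illustrative, so adapt both to your host's actual contract.

```python
#!/usr/bin/env python3
"""pre-tool hook: block obviously dangerous shell commands.

Assumes a host that sends the pending tool call as JSON on stdin and
treats a nonzero exit status as "block this call". The field names and
patterns here are placeholders; check your host's hook documentation.
"""
import json
import sys

BLOCKLIST = ("rm -rf", "git push --force", "curl | sh")

call = json.load(sys.stdin)
command = call.get("command", "")  # field name is host-specific

for pattern in BLOCKLIST:
    if pattern in command:
        print(f"blocked: command matches '{pattern}'", file=sys.stderr)
        sys.exit(2)

sys.exit(0)
```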

What we learn

Safety and consistency should not depend only on the model remembering rules.

5. Compare hosts

Try the same task through a direct shell session, a skill-guided agent, a protocol adapter, and an AI-powered CLI wrapper.

What we learn

The same underlying tool can feel very different depending on UX, approvals, and context policy.

Good first projects

Small enough to build, rich enough to teach.

Doc indexer

Read local docs, produce a JSON index, and answer "where is this concept explained?"

Repo health checker

Inspect a workspace for package files, tests, git status, TODOs, and missing docs.
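A possible starting point, sketched in Python; the set of checks is illustrative and easy to extend.

```python
#!/usr/bin/env python3
"""repo-health: one JSON report per workspace. A starting sketch."""
import json
import pathlib
import re
import subprocess
import sys

root = pathlib.Path(sys.argv[1] if len(sys.argv) > 1 else ".")

def git_dirty() -> bool | None:
    """True if the working tree has changes; None if git is unavailable."""
    try:
        out = subprocess.run(
            ["git", "-C", str(root), "status", "--porcelain"],
            capture_output=True, text=True, check=True,
        )
        return bool(out.stdout.strip())
    except (OSError, subprocess.CalledProcessError):
        return None  # not a git repo, or git missing

todos = sum(len(re.findall(r"\bTODO\b", p.read_text(errors="ignore")))
            for p in root.rglob("*.py"))

report = {
    "package_files": [n for n in ("pyproject.toml", "package.json", "Cargo.toml")
                      if (root / n).exists()],
    "has_tests": (root / "tests").is_dir(),
    "has_readme": any(root.glob("README*")),
    "git_dirty": git_dirty(),
    "todo_count": todos,
}
print(json.dumps(report, indent=2))
```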

Glossary builder

Extract terms from docs, detect undefined jargon, and suggest plain English definitions.
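One crude but checkable heuristic to start from: treat all-caps tokens as jargon candidates and "TERM: definition" lines as definitions. The sketch assumes markdown docs, and everything about the heuristic is an invitation to improve it.

```python
#!/usr/bin/env python3
"""glossary-check: flag acronyms the docs use but never define. A sketch."""
import pathlib
import re
import sys

docs = pathlib.Path(sys.argv[1] if len(sys.argv) > 1 else "docs")

# Heuristic: ALL-CAPS tokens of 2+ letters are jargon candidates.
ACRONYM = re.compile(r"\b[A-Z]{2,}\b")

used, defined = set(), set()
for page in docs.rglob("*.md"):
    for line in page.read_text(errors="ignore").splitlines():
        hits = ACRONYM.findall(line)
        used.update(hits)
        # Treat "TERM: ..." lines as definitions (crude, but checkable).
        if hits and line.strip().startswith(f"{hits[0]}:"):
            defined.add(hits[0])

for term in sorted(used - defined):
    print(f"undefined: {term}")
```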

Experiment checks

A small capability should leave a clean trail.

Determinism: Can the capability be run twice with predictable results?
Discoverability: Can an agent or human understand what the tool does without reading the source?
Safety: Are dangerous actions explicit, gated, or dry-run by default?
Observability: Can we see what happened, what inputs were used, and why it failed?
Portability: Can the same capability be reused by another host or agent?

Walkthrough

The doc indexer experiment

1. CLI: doc-index docs/ --json returns pages, headings, and terms (see the sketch after this list).

2. Skill: instructions tell the agent when to rebuild the index and how to use it while answering questions.

3. MCP: the index becomes a resource the host can query without knowing the CLI details.

4. Hook: the index refreshes after docs change, so stale context is easier to catch.
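A sketch of step 1, the doc-index CLI. The heading and term extraction is deliberately naive; the point is the JSON contract, not the parsing.

```python
#!/usr/bin/env python3
"""doc-index: emit an index of markdown pages, headings, and terms."""
import argparse
import json
import pathlib
import re

parser = argparse.ArgumentParser()
parser.add_argument("docs", type=pathlib.Path, help="docs directory to index")
parser.add_argument("--json", action="store_true", help="emit JSON instead of text")
args = parser.parse_args()

index = {}
for page in sorted(args.docs.rglob("*.md")):
    text = page.read_text(errors="ignore")
    headings = re.findall(r"^#{1,6}\s+(.*)", text, re.MULTILINE)
    # Terms: naive keyword extraction from headings; replace with something smarter.
    terms = sorted({w.lower() for h in headings for w in re.findall(r"\w{4,}", h)})
    index[str(page)] = {"headings": headings, "terms": terms}

if args.json:
    print(json.dumps(index, indent=2))
else:
    for page, info in index.items():
        print(page, "->", ", ".join(info["headings"]))
```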