Getting started

Model access is the first architecture decision.

Before tools, agents, protocols, and hooks matter, you need a way to talk to a model. That can mean a subscription product, an API key, a managed model platform, an aggregator, a local server, or a downloaded model artifact.

Access boundary

Do not collapse all model access into "the model."

A chat subscription, a provider API, a managed platform, a model router, a local runtime, and a model file are different things. They can all lead to a text response, but they create different costs, privacy boundaries, portability constraints, and integration surfaces.

The useful question is not just "which model is best?" It is "where does the model run, how do I call it, and what else comes bundled with that access path?"

Access ladder

The layers people often mix together.

Hosted subscriptions

A human-facing product gives you a chat, IDE, CLI, or assistant surface. You are usually buying the experience, account layer, usage policy, and product workflow more than direct model control.

Examples: ChatGPT, Claude, GitHub Copilot, Gemini, Cursor.

Direct provider APIs

A hosted provider gives you programmatic access through API keys, SDKs, rate limits, model names, and usage billing. This is the most common starting point for building your own tools around a model.

Examples: OpenAI, Anthropic, Google, Mistral, Cohere, xAI.

Need credential handling guidance? See API key security.
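To make the shape concrete, here is a minimal sketch of a direct provider call. It assumes the official OpenAI Python SDK (`pip install openai`) and an `OPENAI_API_KEY` environment variable; the model name is illustrative.

```python
# Minimal direct-provider call, assuming the official OpenAI Python SDK.
# The SDK reads OPENAI_API_KEY from the environment by default.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; use any model your key can access
    messages=[{"role": "user", "content": "One sentence on tides."}],
)
print(response.choices[0].message.content)
```

Everything else you build around an API (retries, logging, cost controls, evals) wraps around calls like this one.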

Aggregate providers and routers

A broker exposes one interface across many model providers. That can make experimentation easier, but it adds another trust, pricing, and routing boundary to understand.

Examples: OpenRouter, LiteLLM-style gateways, provider comparison layers.
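The router value proposition fits in one change: same client code, different base URL and model string. A minimal sketch, assuming an OpenAI-compatible gateway such as OpenRouter and an `OPENROUTER_API_KEY` environment variable; the model slug is illustrative and router-specific.

```python
# Same client code as a direct provider call; only the base URL,
# key, and model slug change. Assumes an OpenAI-compatible router.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model="anthropic/claude-3.5-haiku",  # slug format is router-specific
    messages=[{"role": "user", "content": "One sentence on tides."}],
)
print(response.choices[0].message.content)
```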

Managed model platforms

A managed platform can blend model catalogs, deployment surfaces, governance, enterprise identity, and cloud-native operations. It is not just a provider API and not just a router.

Examples: Azure AI Foundry, Amazon Bedrock, Vertex AI-style managed AI platforms.

See also: Managed model platforms.
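One way to see the platform difference is in how a call authenticates: identity flows through cloud credentials, roles, and regions rather than a bare API key. A hedged sketch, assuming boto3 and an AWS account with Bedrock model access already enabled; the region and model ID are illustrative.

```python
# Managed-platform sketch: auth comes from cloud identity (IAM
# credentials, roles, region), not a raw provider API key.
# Assumes boto3 and Bedrock model access enabled on the account.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # illustrative
    messages=[{"role": "user", "content": [{"text": "One sentence on tides."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```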

Local hosting software

A local runtime downloads or loads a model and exposes it through a desktop app, CLI, or local API endpoint. The endpoint may look like a hosted API, but the operational tradeoffs are yours.

Examples: Ollama, LM Studio, llama.cpp servers, vLLM-style inference servers.

Need the split? Local hosting and model artifacts.

Need the machine-fit side? Local hardware and runtime fit.
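The endpoint really can look like a hosted API. A minimal sketch, assuming Ollama is running on its default port with a small model already pulled (for example via `ollama pull llama3.2`):

```python
# Local-endpoint sketch, assuming Ollama's default port and /api/chat
# shape. The model must already be pulled into the local runtime.
import requests

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "One sentence on tides."}],
        "stream": False,  # ask for one JSON object instead of a stream
    },
    timeout=120,
)
print(response.json()["message"]["content"])
```

The call looks like the hosted one, but the latency, memory pressure, and uptime are now your machine's problem.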

Local model artifacts

The artifact is the actual model checkpoint, weights, or quantized file. It has its own license, size, architecture, context window, hardware needs, and fit for chat, coding, embeddings, or tool use.

Examples: small instruct models, coding models, embedding models, quantized GGUF files.

See also: which part does what.
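You can inspect most of those properties before committing to a download. A sketch, assuming the `huggingface_hub` client library; the repo ID is a placeholder for whatever artifact you are evaluating.

```python
# Pre-download artifact check, assuming the huggingface_hub library.
# files_metadata=True asks the Hub to include per-file sizes.
from huggingface_hub import model_info

info = model_info("Qwen/Qwen2.5-0.5B-Instruct", files_metadata=True)
print([tag for tag in info.tags if tag.startswith("license:")])
for sibling in info.siblings:
    if sibling.rfilename.endswith((".safetensors", ".gguf")):
        print(sibling.rfilename, sibling.size, "bytes")
```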

Client surfaces

A client is what you actually use: chat UI, CLI, SDK, wrapper, IDE extension, agent host, or notebook. It may hide whether the backing model is hosted, local, direct, or routed.

Examples: SDK calls, terminal chat, IDE assistants, local agent hosts.
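That hiding is often deliberate. Here is a sketch of a client that does not care where the model runs, assuming an OpenAI-compatible endpoint on the other side; `MODEL_BASE_URL`, `MODEL_API_KEY`, and `MODEL_NAME` are made-up environment variables for this example.

```python
# Client-surface sketch: the same code talks to a hosted provider,
# a router, or a local runtime, depending on what the environment
# points at. Assumes an OpenAI-compatible endpoint either way.
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("MODEL_BASE_URL", "https://api.openai.com/v1"),
    api_key=os.environ.get("MODEL_API_KEY", "unused-by-some-local-runtimes"),
)

def ask(prompt: str) -> str:
    """One boring call; the backend is whatever the environment says."""
    response = client.chat.completions.create(
        model=os.environ.get("MODEL_NAME", "gpt-4o-mini"),
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```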

Common confusion

A subscription is not the same thing as API access.

| Access path | What you usually get | What it is good for | Where tooling friction appears |
| --- | --- | --- | --- |
| Subscription product | App, chat, IDE, account features, usage limits, saved history, product-specific tools. | Human workflows, writing, coding assistance, product-integrated context, team adoption. | Automation may be limited to product-supported extension points. |
| Provider API | Programmatic endpoint, SDKs, API keys, usage billing, model/version selection. | Custom CLIs, agents, evals, internal tools, repeatable workflows, backend services. | You own auth handling, retries, logging, cost controls, and safety boundaries. |
| Local endpoint | Runtime process, downloaded model, local address, hardware-dependent performance. | Offline experiments, privacy-sensitive prototypes, learning, model swapping, low-cost iteration. | You own installation, updates, speed, memory use, and model compatibility. |

Starting paths

Choose the first path that matches what you actually want.

Most beginners do not need every option at once. Pick the path that matches your immediate goal, then add complexity later.

If you already have a hosted CLI agent product but no API key, that still counts as a model surface. Treat it like a subscription-based host, then use the labs bootstrap step to decide where to jump in.

If you are entering through an enterprise AI cloud, see managed model platforms before choosing where to jump into the labs.

If you are about to put a provider key into a shell or tool host, read API key security first.

Match the path to the goal

Start from the constraint, not from the hype.

| If you want... | Start with... | Because... |
| --- | --- | --- |
| The fastest path to using an assistant | Subscription product | A polished product removes setup and lets you learn the workflow first. |
| A hosted CLI agent surface, but no API key | Subscription product plus CLI host | You already have a usable model-facing interface, even if you do not control a raw API. For the labs, treat that surface as your starting point and skip ahead accordingly. |
| A scriptable foundation for custom tools or agents | Direct provider API | You get a stable programmatic surface you can wrap, log, test, and automate. |
| An enterprise cloud boundary with deployment and policy controls | Managed model platform | You may need model access plus deployment, org policy, identity, and managed evaluation features in one place. |
| Easy model comparison across providers | Aggregate provider or router | You can hold prompts and evals constant while changing the backing model. |
| Offline experiments or maximum local control | Local hosting runtime plus a small model artifact | You own the runtime and endpoint, but you also own the setup and performance tradeoffs. |

Decision handles

The right access path depends on the constraint.

Privacy boundary

Where do prompts, files, outputs, logs, and embeddings go?

Cost model

Are you paying per seat, per token, through a platform or router, or through local hardware?

Latency and reliability

Is the bottleneck network, provider queueing, local CPU/GPU speed, or model size?

Capability fit

Does the model handle chat, code, tool calls, long context, embeddings, or structured output well enough?

Portability

Can you swap models without rewriting your prompts, tool schemas, evals, client code, or platform-specific deployment assumptions?

License and terms

What are you allowed to run, modify, redistribute, log, fine-tune, or use commercially?

Local-first path

If you want the full stack, start from one tiny local runtime.

The current labs use a toy model interface so the tooling boundaries stay easy to see. A deeper track can start from the real beginning: pull a small open model, host it locally, expose an endpoint, and then rebuild the same tooling layers on top.

1. Choose the model artifact. Check license, size, hardware needs, context length, and task fit before downloading anything.
2. Run a local host. Use a runtime that can expose a local endpoint, then prove a simple prompt works.
3. Wrap the endpoint. Build the smallest CLI around that endpoint (see the sketch below), then add tools, JSON, protocols, hooks, memory, and evals.
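For step 3, the smallest CLI can be a single file. A sketch, reusing the Ollama endpoint shape from the local hosting example above; the model name is illustrative.

```python
# Step 3 sketch: the smallest CLI around a local endpoint. Assumes the
# Ollama /api/chat shape used earlier; everything past argument parsing
# is the same single request.
import argparse
import requests

def main() -> None:
    parser = argparse.ArgumentParser(description="Ask a local model one question.")
    parser.add_argument("prompt", help="the prompt to send")
    parser.add_argument("--model", default="llama3.2", help="local model name")
    args = parser.parse_args()

    response = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": args.model,
            "messages": [{"role": "user", "content": args.prompt}],
            "stream": False,
        },
        timeout=120,
    )
    print(response.json()["message"]["content"])

if __name__ == "__main__":
    main()
```

Tools, JSON output, protocols, hooks, memory, and evals all layer onto this skeleton without changing its contract.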

Next move

Turn the chosen surface into one boring interface.

1. Pick the access path. Subscription, API, managed platform, router, or local host.
2. Give it a stable interface. Turn it into one boring command, request, or CLI surface that always accepts the same kind of input (sketched after these steps).
3. Then build upward. Continue into the labs, the stack, and the protocols page once the model surface is clear.
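What "one boring interface" can mean in practice: a command that always reads a prompt on stdin and writes text to stdout, whatever sits behind it. A sketch; `my_model_client.ask` is a hypothetical module wrapping whichever access path you picked (the client surfaces sketch above is one candidate body).

```python
# Boring-interface sketch: prompt in on stdin, text out on stdout.
# my_model_client is hypothetical; it wraps whichever access path
# you chose, so callers never need to know which one that is.
import sys

from my_model_client import ask

if __name__ == "__main__":
    prompt = sys.stdin.read().strip()
    sys.stdout.write(ask(prompt) + "\n")
```

Usage stays identical as the backend changes: `echo "One sentence on tides." | python boring.py`.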