# Choose an instrument

If you only read one line: start on Prime, reach up to Max for hard tasks, drop to Standard for high-volume work. Most teams never need anything else for the first month.

## The three tiers

We publish each task type in three tiers. Tiers differ in their Quality Score floor and in the envelope of cost, latency, and context that suppliers fill at.

* **Prime** is the production default. Strong reasoning, large context, priced to let you run agents and pipelines without watching the meter.
* **Max** is for high-stakes work where errors compound: long-context synthesis, cross-repo code reasoning, long-horizon agent runs, regulated output.
* **Standard** is for cheap, high-volume work: classification, autocomplete, triage, batch pipelines.

Each tier's Quality Score floor is anchored to the relevant [Artificial Analysis](https://artificialanalysis.ai/) leaderboard: Intelligence Index for Text, Coding Index for Code, Agentic Index for Agent. For the current live and preview list with the actual floors, see [Current instruments](/docs/instrument-specifications/current-instruments.md).

## Pick a tier for your workload

### Text

| If your workload looks like this                                      | Use                                          |
| --------------------------------------------------------------------- | -------------------------------------------- |
| Agents, RAG, code generation, drafting, quality-sensitive work        | `text-prime`                                 |
| Very large context, multi-step reasoning, high cost of errors         | `text-max`                                   |
| Classification, chatbots, batch jobs, summarization, high-volume work | `text-standard`                              |
| Not sure yet                                                          | Start on `text-prime`, move up or down later |

### Code (preview)

| If your workload looks like this                                      | Use                                          |
| --------------------------------------------------------------------- | -------------------------------------------- |
| Daily coding, code review, standard debugging, feature implementation | `code-prime`                                 |
| Full codebase analysis, cross-repo migrations, multi-layer debugging  | `code-max`                                   |
| Autocomplete, linting, inline suggestions, batch edits                | `code-standard`                              |
| Not sure yet                                                          | Start on `code-prime`, move up or down later |

### Agent (preview)

| If your workload looks like this                                   | Use                                           |
| ------------------------------------------------------------------ | --------------------------------------------- |
| Production agent loops, multi-step tool use, reasoning workflows   | `agent-prime`                                 |
| Autonomous research, 50+ tool chains, long-horizon tasks (hours)   | `agent-max`                                   |
| Fast tool calls, simple agent loops, high-throughput orchestration | `agent-standard`                              |
| Not sure yet                                                       | Start on `agent-prime`, move up or down later |
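The instrument names in the tables above follow a `<task type>-<tier>` pattern, so a small helper can build them. This is a minimal sketch: `instrument_id`, `TASK_TYPES`, and `TIERS` are hypothetical names chosen for illustration, not part of any published client, and the default of Prime mirrors the "Not sure yet" rows.

```python
# Task types and tiers as they appear in the tables above.
TASK_TYPES = ("text", "code", "agent")
TIERS = ("standard", "prime", "max")


def instrument_id(task_type: str, tier: str = "prime") -> str:
    """Build an instrument name such as 'text-prime'.

    Hypothetical helper: the tier defaults to Prime, matching the
    "Not sure yet" guidance in the tables.
    """
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task type: {task_type!r}")
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    return f"{task_type}-{tier}"
```

Check [Current instruments](/docs/instrument-specifications/current-instruments.md) before relying on a Code or Agent name, since those task types are marked as preview.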

## Default to Prime

Most production traffic on The Grid runs through Prime. If you're starting fresh and not sure where to land, start there. Reach for Prime when:

* You're running an agent loop with 10 to 200 inference calls per task.
* You're building a RAG app where retrieval does the heavy lifting and the model just synthesizes.
* You're generating code that has to compile and pass tests.
* You're producing content from a brief (drafts, structured output, marketing copy, replies).

## Reach for Max only when errors compound

Max tiers are for work where the cost of getting it wrong is much larger than the cost of the inference call. Examples:

* Long-context synthesis where you're packing hundreds of thousands of tokens into one prompt and the answer depends on connecting facts across all of it.
* Architecture-level code reasoning, cross-repo migrations, or debugging across multiple services and runtimes.
* Autonomous research or long-horizon agent runs that take hours and feed downstream work.
* High-stakes professional output (legal, medical, financial) where a wrong answer has real consequences.

Route up by exception. If most of your calls go to Max, you're probably overpaying.
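One way to check whether you are routing up by exception is to measure the share of calls that land on each tier. The sketch below assumes you already log the instrument name for each call; `tier_shares` and `mostly_max` are hypothetical helpers, and the 50% cutoff is just a literal reading of "most of your calls".

```python
from collections import Counter
from typing import Iterable


def tier_shares(instrument_ids: Iterable[str]) -> dict[str, float]:
    """Return the fraction of calls per tier, e.g. {'prime': 0.8, 'max': 0.2}."""
    # The tier is the part after the last hyphen: 'text-max' -> 'max'.
    counts = Counter(name.rsplit("-", 1)[-1] for name in instrument_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tier: n / total for tier, n in counts.items()}


def mostly_max(instrument_ids: Iterable[str]) -> bool:
    """True when more than half of the logged calls went to a Max tier."""
    return tier_shares(instrument_ids).get("max", 0.0) > 0.5
```

For example, a log with eight `text-prime` calls and two `text-max` calls gives a Max share of 0.2, so `mostly_max` returns `False`.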

## Drop to Standard for high-volume work

Standard tiers handle the high-volume, low-ambiguity work that makes up most of an application's traffic. Use Standard when:

* You're classifying, tagging, or routing inbound data at scale.
* You need the first token to arrive in under a second so the response feels instant to the user.
* You're running batch pipelines where throughput and unit economics matter more than reasoning depth.
* You're doing autocomplete, inline suggestions, or formatting work that fires on every keystroke.

The savings usually show up when a single app routes most of its traffic to Standard and the rest to Prime, for example an 80/20 split.
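To see why a split like this matters, treat the blended unit price as the traffic-weighted average of the per-tier prices. The sketch below is arithmetic only: the prices are made-up placeholders, not The Grid's actual prices, and `blended_price` is a hypothetical helper. Substitute your own numbers.

```python
def blended_price(shares: dict[str, float], prices: dict[str, float]) -> float:
    """Traffic-weighted average price across tiers."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(share * prices[tier] for tier, share in shares.items())


# Placeholder prices per million tokens. These are NOT real prices.
prices = {"standard": 1.0, "prime": 5.0}

mixed = blended_price({"standard": 0.8, "prime": 0.2}, prices)
all_prime = blended_price({"prime": 1.0}, prices)
savings = 1 - mixed / all_prime

print(f"blended: {mixed:.2f}, all-Prime: {all_prime:.2f}, savings: {savings:.0%}")
```

With these placeholder numbers the 80/20 split comes out to 1.80 against 5.00 for all-Prime, roughly 64% lower. The real figure depends entirely on the actual price gap between tiers.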

## Mix tiers across one app

Most of the savings come from splitting one application across multiple instruments rather than sending everything to one tier.

* Triage on Standard, reason on Prime. Classify inbound requests on `text-standard`, send the ones that need real reasoning to `text-prime`.
* Retrieve on Standard, synthesize on Prime. Run retrieval and reranking on Standard, generate the final answer on Prime.
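* Autocomplete on Standard, implement on Prime. Use `code-standard` for autocomplete and `code-prime` for implementation.
* Orchestrate on Standard, reason on Prime. Use `agent-standard` for orchestration and `agent-prime` for the calls where the agent actually has to reason.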
* Mix task types. A coding agent can use `code-prime` for writing code, `agent-prime` for planning, and `text-standard` for summarizing. Each instrument is independent.
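The "triage on Standard, reason on Prime" pattern can be sketched as a two-step router. This is an illustration under assumptions: `call` stands in for whatever client function you use to send a prompt to an instrument (its signature here is hypothetical), and the triage prompt and the `simple`/`complex` labels are invented for the example.

```python
from typing import Callable

# Hypothetical client signature: (instrument_name, prompt) -> response text.
Call = Callable[[str, str], str]

TRIAGE_PROMPT = "Reply with exactly 'simple' or 'complex'.\n\nRequest: "


def answer(request: str, call: Call) -> str:
    """Classify on text-standard, then answer on the tier the label selects."""
    label = call("text-standard", TRIAGE_PROMPT + request).strip().lower()
    # Anything other than a clear 'simple' goes to Prime, so a garbled
    # triage reply fails toward the stronger tier.
    instrument = "text-standard" if label == "simple" else "text-prime"
    return call(instrument, request)
```

Passing the client in as a parameter keeps the routing logic testable with a stub. The same shape applies to the other pairings in the list, with the instrument names swapped.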

## Where to go next

* [Quickstart](/docs/start-here/quickstart.md): make your first call on Text Prime.
* [Current instruments](/docs/instrument-specifications/current-instruments.md): the full live and preview list with Quality Score floors and per-task-type [Artificial Analysis](https://artificialanalysis.ai/) anchors.


