# How specifications evolve

## Why specifications get revised

The frontier moves. New models clear higher Quality Scores, serving stacks get faster, context windows get longer. If thresholds stayed fixed, every tier would slowly drift toward irrelevance and Standard would quietly become what Prime is today. Revisions keep each tier meaningful: Max means frontier, Prime means strong production-grade, Standard means fast and cheap but still useful.

For what each threshold actually measures, see [How we define instruments](broken://pages/sfxozGnwuVs8kBcTbFhO). For the live values, see [Current instruments](/docs/instrument-specifications/current-instruments.md). This page is about the schedule and governance around changes.

## Review cadence is quarterly

Every quarter, we revisit each instrument specification:

1. **Refresh benchmark data.** At the end of the quarter, scores across all eligible models and suppliers get pulled from the underlying public leaderboards.
2. **Propose new thresholds.** Quality Score floors, throughput minima, TTFT cutoffs, and task-type-specific benchmark thresholds get re-evaluated.
3. **Validate eligibility.** Before any change publishes, proposed thresholds get tested against the live model set. A reasonable number of distinct models from distinct suppliers still qualify for each instrument. A specification that no model can hit is not a specification, it is a moratorium.
4. **Publish with notice.** Changes publish at least 15 days before the effective date.
5. **Effective date.** On the first day of the new quarter at 00:00 UTC, the updated specification goes live.

Quarterly is long enough that suppliers can plan upgrades without thrash, and short enough that thresholds do not drift far behind the frontier. Three months is the cleanest cadence in commodity markets that benchmark physical or technical specifications (energy, agriculture, metals), and the same logic applies here.

## Supplier rollover keeps the order book filling

Specification changes publish before they take effect, never retroactively. Existing customers see no break in API contracts. The instrument string stays the same. What changes is the supplier mix underneath.

Suppliers whose models no longer meet the revised thresholds get a rollover window:

* **Re-qualify.** Roll over to a model that clears the new specification.
* **Exit.** Stop offering supply on that instrument.
* **Suspend.** Suppliers who do not attest to the new specification are suspended until they comply.

Suppliers who attest on schedule keep their inventory active. The order book continues to fill orders without interruption while the supplier set adjusts.

## Your code does not change when specifications get revised

The instrument string is stable across revisions. When a Quality Score floor goes up, the models serving the instrument get stronger. When older models fall below the revised threshold, they get removed from the eligible set. You write `model: "text-prime"` and ship.

A few practical implications:

* **No migration on revision day.** No new credentials, no new endpoints, no SDK changes.
* **Supplier composition shifts silently.** The model behind a request can change between calls, both within a quarter and across revisions.
* **Notifications go to suppliers, not consumers.** As a buyer, you do not have to track the cycle to keep your code working.

## Code and Agent specifications are still calibrating

Code and Agent instruments are in Preview. Their thresholds are subject to revision before the instruments go live as the underlying benchmarks (SWE-bench Pro, METR, tau-squared-Bench) get refined and as the model landscape shifts. The values on [Current instruments](/docs/instrument-specifications/current-instruments.md) reflect the launch specification. Once Code and Agent go live, they enter the same quarterly cycle as Text.

## Next

* [Current instruments](/docs/instrument-specifications/current-instruments.md). Live catalog with thresholds and when-to-use guidance.
* [How we define instruments](broken://pages/sfxozGnwuVs8kBcTbFhO). What the Quality Score, context, and latency thresholds mean and how qualification works.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://thegrid.ai/docs/instrument-specifications/how-specifications-evolve.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
