Instrument Specifications (latest)

On The Grid, every Instrument is defined by a specification: a set of measurable thresholds that all supplier models must meet or exceed. These thresholds cover intelligence (scored on the Artificial Analysis Intelligence Index), time to first token, streaming throughput, context window size, maximum output length, and reliability (uptime and error rate). The specification is what makes Units fungible across suppliers - any model that passes the bar can deliver the Instrument.

Key metrics that define the Instrument

Every Instrument is defined by a set of metrics. At a high level:

  • Quality

    A score on an independent benchmark, such as the Artificial Analysis Intelligence Index, that reflects a model's ability to fulfill a buyer's requirements.

  • Throughput

    Minimum tokens per second while streaming. This drives both UX smoothness and batch job completion times.

  • Latency (Time to First Token)

    Maximum allowed time from request to first streamed token. This is critical for interactive UX.

  • Context Window

    Minimum number of tokens the model can accept as input. This determines how much content you can pass in a single call.

  • Output Length

    Minimum number of tokens the model can generate in a single response for that Instrument.
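The metrics above can be read as a conformance check: quality, throughput, context window, and output length are minimums, while time to first token is a maximum. Below is a minimal sketch of that idea in Python. The names `InstrumentSpec`, `ModelMeasurement`, `failed_metrics`, and `meets_spec` are hypothetical and not part of any Grid API, the numbers in the usage example are made up for illustration, and reliability (uptime and error rate) is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstrumentSpec:
    """Thresholds for one Instrument (hypothetical structure, for illustration)."""
    min_quality: float         # minimum benchmark score
    min_throughput_tps: float  # minimum streaming tokens per second
    max_ttft_sec: float        # maximum time to first token, in seconds
    min_context_tokens: int    # minimum input tokens accepted
    min_output_tokens: int     # minimum tokens generated per response


@dataclass(frozen=True)
class ModelMeasurement:
    """Measured values for one supplier model (hypothetical structure)."""
    quality: float
    throughput_tps: float
    ttft_sec: float
    context_tokens: int
    output_tokens: int


def failed_metrics(spec: InstrumentSpec, m: ModelMeasurement) -> list[str]:
    """Return the names of the metrics on which the model misses the spec."""
    failures = []
    if m.quality < spec.min_quality:
        failures.append("quality")
    if m.throughput_tps < spec.min_throughput_tps:
        failures.append("throughput")
    if m.ttft_sec > spec.max_ttft_sec:  # latency is an upper bound
        failures.append("latency")
    if m.context_tokens < spec.min_context_tokens:
        failures.append("context_window")
    if m.output_tokens < spec.min_output_tokens:
        failures.append("output_length")
    return failures


def meets_spec(spec: InstrumentSpec, m: ModelMeasurement) -> bool:
    """A model can deliver the Instrument only if it passes every threshold."""
    return not failed_metrics(spec, m)


# Illustrative values only; they are not taken from any published specification.
example_spec = InstrumentSpec(50.0, 100.0, 1.5, 128_000, 64_000)
fast_model = ModelMeasurement(55.0, 120.0, 0.9, 200_000, 64_000)
slow_model = ModelMeasurement(55.0, 80.0, 2.0, 200_000, 64_000)

print(meets_spec(example_spec, fast_model))
print(failed_metrics(example_spec, slow_model))
```

Returning the list of failed metrics, rather than only a boolean, makes it easier to see why a given model does not qualify.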

Current Specifications

As of February 2026, Text Prime and Text Standard are available and are served according to the specifications below. These values may be updated over time.

| Instrument    | Unit      | Throughput      | Latency (Time to First Token) | Context Window | Output Length |
|---------------|-----------|-----------------|-------------------------------|----------------|---------------|
| Text Prime    | 1M tokens | ≥40 tokens/sec  | ≤4.62 sec                     | ≥128K tokens   | ≥64K tokens   |
| Text Standard | 1M tokens | ≥100 tokens/sec | ≤1.32 sec                     | ≥128K tokens   | ≥64K tokens   |
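One way to read the latency and throughput thresholds together: for a conforming model, the time to stream a response of N output tokens is bounded above by roughly the maximum time to first token plus N divided by the minimum throughput. The sketch below applies that arithmetic to the February 2026 values above. It is an estimate under simplifying assumptions (constant throughput, all N tokens counted at the streaming rate), and the names `SPECS` and `worst_case_seconds` are hypothetical.

```python
# Thresholds copied from the February 2026 specification above.
SPECS = {
    "Text Prime": {"max_ttft_sec": 4.62, "min_throughput_tps": 40},
    "Text Standard": {"max_ttft_sec": 1.32, "min_throughput_tps": 100},
}


def worst_case_seconds(instrument: str, output_tokens: int) -> float:
    """Upper-bound estimate of the time to stream `output_tokens` tokens."""
    spec = SPECS[instrument]
    return spec["max_ttft_sec"] + output_tokens / spec["min_throughput_tps"]


for name in SPECS:
    seconds = worst_case_seconds(name, 2000)
    print(f"{name}: at most about {seconds:.2f} s for 2,000 output tokens")
```

For a 2,000-token response this gives 4.62 + 2000/40 = 54.62 seconds for Text Prime and 1.32 + 2000/100 = 21.32 seconds for Text Standard. A model that exceeds the minimums finishes sooner.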

The list of qualifying models for each Instrument evolves constantly. In practice, The Grid maintains a rotating roster and regularly evaluates participating suppliers to ensure conformance.
