Instrument Specifications (as of Jan 2026)

Key metrics that define the Instrument

Every Instrument is defined by a set of metrics. At a high level:

  • Performance / Intelligence

    A score on an independent benchmark such as an Artificial Analysis Intelligence Index that reflects reasoning ability and accuracy.

  • Time to First Token (TTFT)

    Maximum allowed time from request to first streamed token. This is critical for interactive UX.

  • Throughput

    Minimum tokens per second while streaming. This drives both UX smoothness and batch job completion times.

  • Context Window

    Minimum number of tokens the model can accept as input. This determines how much content you can pass in a single call.

  • Output Length

    Minimum number of tokens the model can generate in a single response for that Instrument.

  • Uptime and Error Rate

    Reliability requirements that suppliers must meet to stay eligible.

Current Specifications

As of January 2026, Chat Prime and Chat Fast are available are served as per below specification. These values may be updated over time.

Instrument

Chat Prime

Performance Score (Intelligence Index by ArtificialAnalysis)

≥58

Throughput (Tokens/Sec)

≥50

Time to First Token (Seconds)

≤3.234

Context Window

≥128K

Output Length

≥64K

Qualifying Models

GPT-5 (high), o3-pro, Grok 4, GPT-5 (medium), o3, o4-mini (high), Gemini 2.5 Pro, GPT-5 mini, Qwen3 235B A22B 2507 (Reasoning), GPT-5 (low), Claude 4.1 Opus (Extended Thinking), Gemini 2.5 Pro Preview (Mar '25), Claude 4 Sonnet (Extended Thinking), DeepSeek R1 0528 (May '25), Gemini 2.5 Flash (Reasoning), gpt-oss-120B

Instrument

Chat Fast

Performance Score (Intelligence Index by ArtificialAnalysis)

≥25

Throughput (Tokens/Sec)

≥160

Time to First Token (Seconds)

≤0.479

Context Window

≥128K

Output Length

≥32K

Qualifying Models

Grok 3, GPT-4o (March 2025, chatgpt-4o-latest), Mistral Medium 3, Gemini 2.0 Pro Experimental (Feb '25), DeepSeek R1 Distill Qwen 14B, Sonar Reasoning, Gemini 2.5 Flash Preview, Gemini 2.0 Flash (Feb '25), Magistral Medium, DeepSeek R1 Distill Llama 70B, Claude 3.7 Sonnet (Standard), Qwen3 4B (Reasoning), Reka Flash 3, Magistral Small, Gemini 2.0 Flash (experimental), Nova Premier, Gemini 2.5 Flash-Lite, DeepSeek V3 (Dec '24), Qwen2.5 Max, Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning), Solar Pro 2 (Preview), Gemini 1.5 Pro (Sep '24), Claude 3.5 Sonnet (Oct '24), Qwen3 Coder 30B A3B, Qwen3 235B A22B, Solar Pro 2, Llama 4 Scout, Sonar, Mistral Small 3.2, Sonar Pro, Command A, QwQ 32B-Preview, Devstral Medium, Llama 3.3 Instruct 70B, Gemini 2.0 Flash-Lite (Feb '25), Qwen3 30B A3B, GPT-4.1 nano, Qwen3 14B, Qwen3 32B, GPT-4o (May '24), Gemini 2.0 Flash-Lite (Preview), GPT-4o (Nov '24), GPT-4o (Aug '24), Llama 3.1 Instruct 405B, Qwen2.5 Instruct 72B, MiniMax-Text-01, Claude 3.5 Sonnet (June '24), Nova Pro, Llama 3.1 Tulu3 405B, GPT-4o (ChatGPT), Llama 3.3 Nemotron Super 49B v1, Grok 2 (Dec '24), Phi-4, Gemini 1.5 Flash (Sep '24), GPT-4 Turbo, Mistral Large 2 (Nov '24), Qwen3 1.7B (Reasoning), Llama Nemotron Super 49B v1.5, Grok Beta, Pixtral Large, Qwen2.5 Instruct 32B, Llama 3.1 Nemotron Instruct 70B, Qwen3 8B, Gemma 3 27B Instruct, Mistral Large 2 (Jul '24), Qwen2.5 Coder Instruct 32B

The list of qualifying models for each Instrument evolves constantly. In practice, The Grid maintains a rotating roster of frontier and near frontier models from multiple providers.

Last updated

Was this helpful?