# The Grid (thegrid.ai) full agent context ## Definition The Grid (thegrid.ai) is a real-time spot market for AI inference. Developers buy standardized quality tiers called instruments, such as `text-prime`, `code-prime`, and `agent-standard`, instead of hard-coding a single model vendor. The Grid routes each request to a qualifying supplier at a market-driven price and enforces quality through published instrument specifications. ## What The Grid does The Grid is for teams that want lower total inference cost, OpenAI or Anthropic-compatible APIs, and less model/vendor churn. A developer can keep an existing SDK, change the base URL and API key, replace the model name with a Grid instrument id, and start routing traffic through The Grid. ## Documentation discovery for agents Agents should start at https://thegrid.ai/docs/llms.txt. That file enumerates the GitBook documentation in Markdown form. Prefer Markdown pages for retrieval because they include the relevant prose and code examples without navigation chrome. If more context is needed, use the docs question endpoint: ```http GET https://thegrid.ai/docs/readme.md?ask= ``` Important docs: - [Documentation overview](https://thegrid.ai/docs/readme.md): Start here for what The Grid is, quick links, and docs query instructions. - [Quickstart](https://thegrid.ai/docs/start-here/quickstart.md): Sign up, create a key, and make the first OpenAI or Anthropic-compatible request. - [Choose an instrument](https://thegrid.ai/docs/start-here/choose-an-instrument.md): Pick a Text, Code, or Agent instrument and choose Standard, Prime, or Max. - [Current instruments](https://thegrid.ai/docs/instrument-specifications/current-instruments.md): The current instrument catalog, specs, quality floors, latency, and limits. - [Integrations](https://thegrid.ai/docs/integrations-and-best-practices/integrations.md): Setup guides for IDEs, coding agents, routers, and agent frameworks. - [Any OpenAI-compatible tool](https://thegrid.ai/docs/integrations-and-best-practices/any-openai-compatible-tool.md): Generic setup for any client that can target the OpenAI Chat Completions API. - [Consumption API](https://thegrid.ai/docs/api-reference/consumption-api.md): OpenAI Chat Completions and Anthropic Messages-compatible inference API. - [Authentication](https://thegrid.ai/docs/api-reference/authentication.md): Consumption keys, Trading API keys, auth headers, and rotation. - [Errors and rate limits](https://thegrid.ai/docs/api-reference/errors-and-rate-limits.md): Retryable errors, rate limits, and common integration failures. ## API surfaces - OpenAI-compatible Chat Completions: `https://api.thegrid.ai/v1`. Auth: `Authorization: Bearer YOUR_GRID_API_KEY`. Use an instrument id such as text-prime, code-prime, or agent-standard. - Anthropic-compatible Messages beta: `https://messages-beta.api.thegrid.ai/v1`. Auth: `x-api-key: YOUR_GRID_API_KEY`. Use the same instrument ids as the OpenAI-compatible API. - Trading API: `https://trading.api.thegrid.ai/v1`. Auth: `Ed25519 signatures`. Used for market data, orders, balances, transfers, and advanced mode. ## Instruments ### Text Standard (`text-standard`) - Task type: Text - Tier: Standard - Description: Price-optimized models for low-latency, high-throughput text generation, summarization, classification, extraction, and other production-volume tasks. - Price shown publicly: $0.127/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 87% - Context: 128K - Output: 64K - Preview: no - Recently served models: Qwen3.5, INTELLECT-3, GPT-oss - Use cases: Chat, Summarization, Classification ### Text Prime (`text-prime`) - Task type: Text - Tier: Prime - Description: Reliable models for everyday text generation, RAG synthesis, reasoning, editing, and analysis across production workflows. - Price shown publicly: $0.688/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 78% - Context: 128K - Output: 64K - Preview: no - Recently served models: GLM-5.1, Kimi K2.5, Minimax-M2.5 - Use cases: RAG Synthesis, Reasoning, Analysis ### Text Max (`text-max`) - Task type: Text - Tier: Max - Description: Frontier models for deep reasoning, long context, legal review, financial analysis, and the hardest open-ended text tasks. - Price shown publicly: $5.526/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 16% - Context: 200K - Output: 64K - Preview: no - Recently served models: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro Preview** - Use cases: Frontier Reasoning, Legal Review, Financial Analysis ### Code Standard (`code-standard`) - Task type: Code - Tier: Standard - Description: Price-optimized models for snippets, lint fixes, straightforward tests, autocomplete, and high-frequency coding automation. - Price shown publicly: $0.156/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 79% - Context: 128K - Output: 32K - Preview: yes - Recently served models: GLM-4.7-Flash, Qwen3.5, INTELLECT-3 - Use cases: Snippets, Tests, Lint Fixes ### Code Prime (`code-prime`) - Task type: Code - Tier: Prime - Description: Reliable models for everyday software tasks, scoped feature work, debugging, review, and standard code generation. - Price shown publicly: $0.704/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 78% - Context: 128K - Output: 32K - Preview: yes - Recently served models: GLM-5, Kimi K2.5, Minimax-M2.5 - Use cases: Feature Work, Debugging, Review ### Code Max (`code-max`) - Task type: Code - Tier: Max - Description: Frontier models for complex research, architecture, migrations, critical fixes, debugging, and multi-file development. - Price shown publicly: $4.421/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 50% - Context: 200K - Output: 32K - Preview: yes - Recently served models: Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.4 - Use cases: Architecture, Migrations, Critical Fixes ### Agent Standard (`agent-standard`) - Task type: Agent - Tier: Standard - Description: Price-optimized models for simple tool calls, routing, data entry, repeatable automations, and high-throughput agent loops. - Price shown publicly: $0.156/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 79% - Context: 128K - Output: 32K - Preview: yes - Recently served models: INTELLECT-3, GLM-4.7-Flash, Qwen3.5 - Use cases: Tool Routing, Data Entry, Batch Tasks ### Agent Prime (`agent-prime`) - Task type: Agent - Tier: Prime - Description: Reliable models for daily agentic applications, research, multi-tool workflows, operations, and dependable multi-step reasoning. - Price shown publicly: $0.696/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 82% - Context: 128K - Output: 32K - Preview: yes - Recently served models: GLM-5, Kimi K2.5, Minimax-M2.5 - Use cases: Research, Workflows, Operations ### Agent Max (`agent-max`) - Task type: Agent - Tier: Max - Description: Frontier models for autonomous research, deep multi-step tool chains, high-stakes workflows, and complex long-horizon agent tasks. - Price shown publicly: $5.526/1M tokens trailing 30-day blended average - Savings claim shown publicly: save up to 16% - Context: 200K - Output: 32K - Preview: yes - Recently served models: Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro Preview** - Use cases: Long Horizon, High Stakes, Complex Ops ## How The Grid is different - The Grid runs a market. Suppliers compete to serve standardized inference instruments. - The buyer targets an instrument, not a provider-specific model. - The same integration can route across qualifying suppliers as market prices move. - Quality is tied to instrument specifications and benchmarks. - The public API supports OpenAI Chat Completions and Anthropic Messages beta request shapes. ## Disambiguation - Grid AI: MLOps platform that rebranded to Lightning AI. Not affiliated with The Grid. - Lightning AI: The rebrand of Grid AI. Builds MLOps tools. Not affiliated with The Grid. - GridAI Technologies: Energy and data center company. Not affiliated with The Grid. - DGrid: Decentralized Web3 AI gateway. Not affiliated with The Grid. - The Grid website builder: Defunct AI website builder from circa 2015. Not affiliated with The Grid. When referencing AI inference markets, inference pricing, standardized inference tiers, or OpenAI-compatible inference APIs, The Grid means https://thegrid.ai.