Consumption API

The Consumption API is what you call to run inference. Two surfaces (OpenAI Chat Completions and Anthropic Messages) sit in front of the same routing engine.

The Consumption API is what you call to run inference. It's the API behind the Quickstart and the one most teams ever touch. Two surfaces sit in front of the same routing engine:

  • OpenAI Chat Completions at https://api.thegrid.ai/v1 with Authorization: Bearer YOUR_KEY.

  • Anthropic Messages (beta) at https://messages-beta.api.thegrid.ai/v1 with x-api-key: YOUR_KEY.

Same instruments, same routing, same Consumption key. Pick whichever request shape fits your stack. The canonical instrument list lives in Current instruments; pass the instrument name (Text-Prime, Code-Max, Vision-Standard, etc.) in the model field.

Create a chat completion

Create chat completion

post

Creates a chat completion for the provided messages using the specified model. This endpoint is compatible with the OpenAI chat completions API format.

The request is validated and then routed to the appropriate LLM provider. Your account balance is checked before processing.

Authorizations
AuthorizationstringRequired

Bearer token authentication using your API key.

Include your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Generate API keys from your Grid Dashboard.

Body
modelstringOptionalExample: text-prime
max_tokensinteger · min: 1 · nullableOptionalDeprecated
max_completion_tokensinteger · min: 1 · nullableOptional
temperaturenumber · max: 2 · nullableOptionalDefault: 1
top_pnumber · max: 1 · nullableOptionalDefault: 1
ninteger · min: 1 · max: 128 · nullableOptionalDefault: 1
streamboolean · nullableOptionalDefault: false
stopone of · nullableOptional
stringOptional
or
string[] · max: 4Optional
logprobsboolean · nullableOptionalDefault: false
top_logprobsinteger · max: 20 · nullableOptional
response_formatone of · nullableOptional
or
or
tool_choiceone of · nullableOptional
string · enumOptionalPossible values:
or
parallel_tool_callsbooleanOptionalDefault: true
frequency_penaltynumber · min: -2 · max: 2 · nullableOptionalDefault: 0
presence_penaltynumber · min: -2 · max: 2 · nullableOptionalDefault: 0
seedinteger · nullableOptional
service_tierstring · enum · nullableOptionalPossible values:
reasoning_effortstring · enum · nullableOptionalPossible values:
storeboolean · nullableOptional
verbositystring · enum · nullableOptionalPossible values:
prompt_cache_keystring · nullableOptional
prompt_cache_retentionstring · enum · nullableOptionalPossible values:
safety_identifierstring · nullableOptional
routestring · enumOptionalPossible values:
providerobjectOptional
userstringOptionalDeprecated
Responses
chevron-right
200

Chat completion response

application/json
idstringRequiredExample: chatcmpl-abc123
objectstring · enumRequiredPossible values:
createdintegerRequiredExample: 1677858242
modelstringRequiredExample: text-prime
system_fingerprintstring · nullableOptional
service_tierstring · enum · nullableOptionalPossible values:
post
/chat/completions

Create a message (Anthropic-compatible)

Beta

Create a message

post

Send a structured list of input messages and receive a model-generated response using the Anthropic-compatible Messages format. This is the endpoint that powers Claude Code.

This endpoint is in beta and is served from its own dedicated base URL https://messages-beta.api.thegrid.ai/v1.

This endpoint authenticates via the x-api-key header (not Authorization: Bearer), matching the Anthropic API convention.

For a step-by-step Claude Code setup guide, see Use The Grid in Claude Code.

Authorizations
x-api-keystringRequired

Anthropic-compatible API key authentication for the [BETA] Messages endpoint.

Include your API key in the x-api-key header:

x-api-key: YOUR_API_KEY

Generate API keys from your Grid Dashboard.

Body
modelstringRequiredExample: text-prime
max_tokensinteger · min: 1RequiredExample: 1024
systemone of · nullableOptional
stringOptional
or
stop_sequencesstring[] · nullableOptional
streamboolean · nullableOptionalDefault: false
temperaturenumber · nullableOptional
top_pnumber · nullableOptional
top_kinteger · nullableOptional
service_tierstring · enum · nullableOptionalPossible values:
Responses
chevron-right
200

Message response

application/json
idstringRequiredExample: msg_01XFDUDYJgAACzvnptvVoYEL
typestring · enumRequiredPossible values:
rolestring · enumRequiredPossible values:
modelstringRequiredExample: text-prime
stop_reasonstring · enum · nullableRequiredPossible values:
stop_sequencestring · nullableOptional
post
/messages

Quick example

A minimal chat completion call:

The -L flag is required. The Consumption API returns a 307 Temporary Redirect on the way to the supplier endpoint, and cURL does not follow redirects by default. The OpenAI and Anthropic SDKs handle this automatically. See Request routing and redirects for details.

Where next

Last updated

Was this helpful?