Best practices
Once you start using The Grid, you migrate your workloads from provider-specific models to Instruments, which provide the right to consume inference defined by standardized specifications that can be fulfilled by any Supplier meeting the criteria.
We suggest the following guidelines to make this transition smooth and to ensure maximum performance and high reliability.
Which use cases are an ideal fit for Instruments?
Suggested Use Cases:
General reasoning and conversational workflows: chatbots, assistants, and interactive product UX
Summarization at scale: meeting notes, long documents, and multi-part text summaries
Document Q&A: retrieval + answer generation over internal docs, policies, and knowledge bases
Code assist for everyday engineering: easy to medium code generation, code explanation, and bug fixes (think editor helper, not full autonomous coding)
Structured outputs: JSON-like schemas and predictable response formats for downstream parsing
Lightweight tool usage: calling a few tools as part of a workflow when needed
More generally, here are some great first workloads you can migrate:
Latency-sensitive user-facing chat experiences
High-volume internal automations (summaries, triage, classification, routing)
Code assistant workflows for common tasks (snippets, refactors, fixes)
A few considerations before choosing Instruments:
If a workflow is sensitive to a particular model’s style, run a quick side-by-side eval against the Instrument spec before fully switching.
Keep prompts portable and stick to documented Instrument parameters so any conforming Supplier can serve your requests reliably.
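The side-by-side eval mentioned above can be sketched as a small harness that runs the same prompts through your current model and the Instrument, then scores both outputs. The callables and the scoring function below are placeholders for your own backends and eval metric, not part of any Grid API.

```python
# Minimal side-by-side eval sketch. `current_model` and `instrument` stand in
# for your provider model and the Instrument; `judge` is any scoring function.
# All three names are hypothetical placeholders.
def side_by_side(prompts, current_model, instrument, judge):
    """Run each prompt through both backends and score the outputs."""
    results = []
    for prompt in prompts:
        current_out = current_model(prompt)
        instrument_out = instrument(prompt)
        results.append({
            "prompt": prompt,
            "current": judge(current_out),
            "instrument": judge(instrument_out),
        })
    return results
```

With per-prompt scores side by side, you can decide whether the Instrument's outputs are acceptable before fully switching.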
Mapping other chat APIs to The Grid
The Consumption API uses the Chat Completions API interface. To use Instruments, pass the instrument_id in the model parameter.
Conceptual mapping:
model → instrument_id (e.g., "gpt-5" → "chat-fast" or "chat-prime")
base_url → Grid consumption base URL
api_key → Grid consumption key
Example request change:
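A minimal sketch of the change, assuming a Chat Completions-style request body. The instrument_id, base URL, and key below are illustrative placeholders, not real values from The Grid's documentation.

```python
# Illustrative sketch only: "chat-fast", the base URL, and the key are
# hypothetical placeholders.
provider_request = {
    "model": "gpt-5",  # provider-specific model
    "messages": [{"role": "user", "content": "Summarize this meeting."}],
}

# Switching to an Instrument changes only the model field and connection
# settings; the Chat Completions payload shape stays the same.
grid_request = {
    **provider_request,
    "model": "chat-fast",  # instrument_id goes in the model field
}

grid_client_settings = {
    "base_url": "https://grid.example.com/v1",  # hypothetical consumption base URL
    "api_key": "YOUR_GRID_CONSUMPTION_KEY",     # your Grid consumption key
}
```

With the OpenAI SDK, these settings map to the client constructor and the create call: `OpenAI(base_url=..., api_key=...)` and `client.chat.completions.create(model="chat-fast", ...)`.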
Note: The Grid uses the OpenAI Chat Completions API format. Native Anthropic and Google APIs (such as Messages or Responses) are not currently supported. Use the OpenAI SDK as shown above.
Best practices for modifying prompts when using Instruments
When using an Instrument, your prompts should be portable so that any model serving the Instrument produces acceptable results.
Use a strict allowlist for request parameters
Send only officially supported parameters. Treat everything else as best-effort.
The official list of supported parameters in the Consumption API will be published soon.
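The allowlist pattern can be sketched as a small sanitizer applied to every outgoing request. Because the official parameter list has not been published yet, the names below are illustrative assumptions only.

```python
# Hypothetical allowlist: the official supported-parameter list is not yet
# published, so these names are illustrative placeholders.
ALLOWED_PARAMS = {"model", "messages", "temperature", "max_tokens", "stream"}

def sanitize_request(params: dict) -> dict:
    """Keep only allowlisted parameters; drop everything else."""
    dropped = set(params) - ALLOWED_PARAMS
    if dropped:
        # Log rather than fail: anything outside the allowlist is
        # best-effort at most, so it should never reach the Supplier.
        print(f"Dropping unsupported parameters: {sorted(dropped)}")
    return {k: v for k, v in params.items() if k in ALLOWED_PARAMS}
```

Running every request through a sanitizer like this keeps your traffic within the documented surface, so any conforming Supplier can serve it reliably.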
Plan for batch workloads with custom queueing
Key considerations:
Estimate capacity before starting
Pre-buy capacity, then start when available
Treat rate limits and concurrency as first-class controls
Monitor capacity continuously during the run
Decide what happens if you run low mid-run
See the Batching Sequencer example script for our guideline implementation.