Best practices

Once you start using The Grid, you migrate your workloads from provider-specific models to Instruments, which provide the right to consume inference defined by standardized specifications and can be fulfilled by any Supplier that meets the criteria.

We suggest the following guidelines to make this transition smooth and to ensure maximum performance and high reliability.

What use cases are an ideal fit for Instruments?

  1. Suggested use cases:

    1. General reasoning and conversational workflows: chatbots, assistants, and interactive product UX

    2. Summarization at scale: meeting notes, long documents, and multi-part text summaries

    3. Document Q&A: retrieval + answer generation over internal docs, policies, and knowledge bases

    4. Code assist for everyday engineering: easy-to-medium code generation, code explanation, and bug fixes (think editor helper, not full autonomous coding)

    5. Structured outputs: JSON-like schemas and predictable response formats for downstream parsing

    6. Lightweight tool usage: calling a few tools as part of a workflow when needed

  2. More generally, here are some great first workloads you can migrate:

    1. Latency-sensitive user-facing chat experiences

    2. High-volume internal automations (summaries, triage, classification, routing)

    3. Code assistant workflows for common tasks (snippets, refactors, fixes)

  3. A few considerations before choosing Instruments:

    1. If a workflow is sensitive to a particular model’s style, run a quick side-by-side eval against the Instrument spec before fully switching.

    2. Keep prompts portable and stick to documented Instrument parameters so any conforming Supplier can serve your requests reliably.
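The quick side-by-side eval suggested above can be sketched minimally. Everything here is hypothetical: the `side_by_side` helper and the two callables are placeholders for your current provider call and the Instrument call, not part of The Grid's API.

```python
# Minimal side-by-side eval harness (hypothetical helper, not a Grid API).
# Pass two callables that take a prompt and return a completion string:
# one backed by your current provider model, one backed by the Instrument.

def side_by_side(prompts, current_model, instrument):
    """Collect paired outputs so a reviewer can compare style and quality."""
    rows = []
    for prompt in prompts:
        rows.append({
            "prompt": prompt,
            "current": current_model(prompt),
            "instrument": instrument(prompt),
        })
    return rows

# Example with stand-in functions (replace with real API calls):
rows = side_by_side(
    ["Summarize our refund policy."],
    current_model=lambda p: "provider answer",
    instrument=lambda p: "instrument answer",
)
for row in rows:
    print(row["prompt"], "->", row["current"], "|", row["instrument"])
```

Reviewing even a few dozen paired outputs is usually enough to spot style-sensitive workflows before switching.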

Mapping other chat APIs to The Grid

The Consumption API uses the Chat Completions API interface. However, to use Instruments, you pass the instrument_id in the model parameter.

Conceptual mapping:

  • model → instrument_id (e.g., "gpt-5" → "chat-fast" or "chat-prime")

  • base_url → Grid consumption base URL

  • api_key → Grid consumption key

Example request change:
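As a sketch of the change (the base URL and key shown in the comment are placeholders, and "chat-fast" is one of the example Instrument IDs above): the request body keeps the Chat Completions shape, and only the model field changes.

```python
# Before: a provider-specific Chat Completions payload.
provider_request = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Hello"}],
}

# After: the same payload against The Grid, with the Instrument ID
# swapped into the model field. Nothing else in the body changes.
grid_request = {**provider_request, "model": "chat-fast"}

# With the OpenAI SDK you would also point base_url at the Grid
# consumption endpoint and use your Grid consumption key, e.g.:
#   client = OpenAI(base_url="<grid-consumption-base-url>", api_key="<grid-key>")
print(grid_request["model"])  # chat-fast
```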

Note: The Grid uses the OpenAI Chat Completions API format. Native Anthropic and Google APIs (like Messages, Responses) are not currently supported. Use the OpenAI SDK as shown above.

Best practices for modifying prompts when using Instruments

When using an Instrument, keep your prompts portable so that any model serving the Instrument provides acceptable inference.

Use a strict allowlist for request parameters

Send only officially supported parameters. Treat everything else as best-effort.
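A minimal sketch of that allowlist, assuming the official list is not yet published, so the parameter names below are placeholders you should replace once it is:

```python
# Hypothetical allowlist: the official supported-parameter list has not
# been published yet, so treat these names as placeholders.
ALLOWED_PARAMS = {"model", "messages", "temperature", "max_tokens", "stream"}

def strip_unsupported(request: dict) -> dict:
    """Drop any request field not on the allowlist before sending."""
    return {k: v for k, v in request.items() if k in ALLOWED_PARAMS}

request = {
    "model": "chat-fast",
    "messages": [{"role": "user", "content": "Hi"}],
    "logit_bias": {"50256": -100},  # provider-specific: silently dropped
}
clean = strip_unsupported(request)
print(sorted(clean))  # ['messages', 'model']
```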

Note: The official list of supported parameters in the Consumption API will be published soon.

Prompt for outcomes, not model quirks

Good prompt (portable):
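An illustrative example (hypothetical, not from the spec) that states the desired outcome and output contract rather than a model's habits:

```text
Summarize the document below in 3 bullet points of at most 20 words each.
Return only the bullet points, with no preamble.
```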

Avoid (model-specific):
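An illustrative counter-example (also hypothetical) that leans on one model's identity and habits, and so breaks when a different Supplier serves the Instrument:

```text
Respond in your usual style, and think step by step exactly the way
GPT-5 normally does before giving the final answer.
```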

Validate outputs in your app layer
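A minimal sketch of app-layer validation, assuming a hypothetical required field (`summary`): parse the model's JSON and enforce the schema before anything downstream trusts it.

```python
import json

def validate_output(raw: str) -> dict:
    """Parse a model response and enforce a minimal (hypothetical) schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}")
    if not isinstance(data.get("summary"), str):
        raise ValueError("missing required string field: summary")
    return data

good = validate_output('{"summary": "All systems nominal."}')
print(good["summary"])  # All systems nominal.
```

Rejecting and retrying a malformed response in your own code is more portable than prompting a specific model until it "usually" complies.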

Plan for batch workloads with custom queueing

Key considerations:

  • Estimate capacity before starting

  • Pre-buy capacity, then start when available

  • Treat rate limits and concurrency as first-class controls

  • Monitor capacity continuously during the run

  • Decide what happens if you run low mid-run

See the Batching Sequencer example script for our guideline implementation.
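The considerations above can be sketched with a simple bounded-concurrency queue. This is not the Batching Sequencer script itself; the worker and the concurrency cap are placeholders for your real Instrument call and your purchased capacity.

```python
import asyncio

async def run_batch(items, worker, max_concurrency=4):
    """Process items with a hard cap on concurrent in-flight requests."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with sem:  # concurrency limit as a first-class control
            return await worker(item)

    # gather preserves input order, which simplifies downstream bookkeeping
    return await asyncio.gather(*(guarded(i) for i in items))

# Stand-in worker; replace with a real Instrument call.
async def fake_worker(item):
    await asyncio.sleep(0)
    return f"done:{item}"

results = asyncio.run(run_batch(["a", "b", "c"], fake_worker, max_concurrency=2))
print(results)  # ['done:a', 'done:b', 'done:c']
```

Capacity monitoring and the "what if we run low mid-run" decision would hook into `run_batch` (for example, pausing the semaphore loop), but are left out of this sketch.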
