Best practices

Once you start using The Grid, you migrate your workloads from provider-specific models to Instruments, which provide the right to consume inference defined by standardized specifications and can be fulfilled by any Supplier that meets the criteria.

We suggest the following guidelines to make this transition smooth and to ensure maximum performance and high reliability.

What use cases are an ideal fit for Instruments?

  1. Suggested use cases:

    1. General reasoning and conversational workflows: chatbots, assistants, and interactive product UX

    2. Summarization at scale: meeting notes, long documents, and multi-part text summaries

    3. Document Q&A: retrieval + answer generation over internal docs, policies, and knowledge bases

    4. Code assist for everyday engineering: easy-to-medium code generation, code explanation, and bug fixes (think editor helper, not full autonomous coding)

    5. Structured outputs: JSON-like schemas and predictable response formats for downstream parsing

    6. Lightweight tool usage: calling a few tools as part of a workflow when needed

  2. More generally, here are some great first workloads you can migrate:

    1. Latency-sensitive user-facing chat experiences

    2. High-volume internal automations (summaries, triage, classification, routing)

    3. Code assistant workflows for common tasks (snippets, refactors, fixes)

  3. A few considerations before choosing Instruments:

    1. If a workflow is sensitive to a particular model’s style, run a quick side-by-side eval against the Instrument spec before fully switching.

    2. Keep prompts portable and stick to documented Instrument parameters so any conforming Supplier can serve your requests reliably.
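The quick side-by-side eval suggested above can be sketched minimally. Everything here is hypothetical: the `side_by_side` helper and the two callables are placeholders for your current provider call and the Instrument call, not part of The Grid's API.

```python
# Minimal side-by-side eval harness (hypothetical helper, not a Grid API).
# Pass two callables that take a prompt and return a completion string:
# one backed by your current provider model, one backed by the Instrument.

def side_by_side(prompts, current_model, instrument):
    """Collect paired outputs so a reviewer can compare style and quality."""
    rows = []
    for prompt in prompts:
        rows.append({
            "prompt": prompt,
            "current": current_model(prompt),
            "instrument": instrument(prompt),
        })
    return rows

# Example with stand-in functions (replace with real API calls):
rows = side_by_side(
    ["Summarize our refund policy."],
    current_model=lambda p: "provider answer",
    instrument=lambda p: "instrument answer",
)
for row in rows:
    print(row["prompt"], "->", row["current"], "|", row["instrument"])
```

Reviewing even a few dozen paired outputs is usually enough to spot style-sensitive workflows before switching.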

Mapping other chat APIs to The Grid

The Consumption API uses the Chat Completions API interface. However, to use Instruments, you pass the instrument_id in the model parameter.

Conceptual mapping:

  • model → instrument_id (e.g., "gpt-5" → "chat-fast" or "chat-prime")

  • base_url → Grid consumption base URL

  • api_key → Grid consumption key

Example request change:
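As a sketch of the change (the base URL and key shown in the comment are placeholders, and "chat-fast" is one of the example Instrument IDs above): the request body keeps the Chat Completions shape, and only the model field changes.

```python
# Before: a provider-specific Chat Completions payload.
provider_request = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Hello"}],
}

# After: the same payload against The Grid, with the Instrument ID
# swapped into the model field. Nothing else in the body changes.
grid_request = {**provider_request, "model": "chat-fast"}

# With the OpenAI SDK you would also point base_url at the Grid
# consumption endpoint and use your Grid consumption key, e.g.:
#   client = OpenAI(base_url="<grid-consumption-base-url>", api_key="<grid-key>")
print(grid_request["model"])  # chat-fast
```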

Note: The Grid uses the OpenAI Chat Completions API format. Native Anthropic and Google APIs (like Messages, Responses) are not currently supported. Use the OpenAI SDK as shown above.

Best practices for modifying prompts when using Instruments

When using an Instrument, keep your prompts portable so that any model serving the Instrument provides acceptable inference.

Use a strict allowlist for request parameters

Send only officially supported parameters. Treat everything else as best-effort.
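A minimal sketch of that allowlist, assuming the official list is not yet published, so the parameter names below are placeholders you should replace once it is:

```python
# Hypothetical allowlist: the official supported-parameter list has not
# been published yet, so treat these names as placeholders.
ALLOWED_PARAMS = {"model", "messages", "temperature", "max_tokens", "stream"}

def strip_unsupported(request: dict) -> dict:
    """Drop any request field not on the allowlist before sending."""
    return {k: v for k, v in request.items() if k in ALLOWED_PARAMS}

request = {
    "model": "chat-fast",
    "messages": [{"role": "user", "content": "Hi"}],
    "logit_bias": {"50256": -100},  # provider-specific: silently dropped
}
clean = strip_unsupported(request)
print(sorted(clean))  # ['messages', 'model']
```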

Note: The official list of supported parameters in the Consumption API will be published soon.

Prompt for outcomes, not model quirks

Good prompt (portable):
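An illustrative example (hypothetical, not from the spec) that states the desired outcome and output contract rather than a model's habits:

```text
Summarize the document below in 3 bullet points of at most 20 words each.
Return only the bullet points, with no preamble.
```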

Avoid (model-specific):
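An illustrative counter-example (also hypothetical) that leans on one model's identity and habits, and so breaks when a different Supplier serves the Instrument:

```text
Respond in your usual style, and think step by step exactly the way
GPT-5 normally does before giving the final answer.
```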

Validate outputs in your app layer
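A minimal sketch of app-layer validation, assuming a hypothetical required field (`summary`): parse the model's JSON and enforce the schema before anything downstream trusts it.

```python
import json

def validate_output(raw: str) -> dict:
    """Parse a model response and enforce a minimal (hypothetical) schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}")
    if not isinstance(data.get("summary"), str):
        raise ValueError("missing required string field: summary")
    return data

good = validate_output('{"summary": "All systems nominal."}')
print(good["summary"])  # All systems nominal.
```

Rejecting and retrying a malformed response in your own code is more portable than prompting a specific model until it "usually" complies.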

Plan for batch workloads with custom queueing

Key considerations:

  • Estimate capacity before starting

  • Pre-buy capacity, then start when available

  • Treat rate limits and concurrency as first-class controls

  • Monitor capacity continuously during the run

  • Decide what happens if you run low mid-run

See the Batching Sequencer example script for our guideline implementation.
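The considerations above can be sketched with a simple bounded-concurrency queue. This is not the Batching Sequencer script itself; the worker and the concurrency cap are placeholders for your real Instrument call and your purchased capacity.

```python
import asyncio

async def run_batch(items, worker, max_concurrency=4):
    """Process items with a hard cap on concurrent in-flight requests."""
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(item):
        async with sem:  # concurrency limit as a first-class control
            return await worker(item)

    # gather preserves input order, which simplifies downstream bookkeeping
    return await asyncio.gather(*(guarded(i) for i in items))

# Stand-in worker; replace with a real Instrument call.
async def fake_worker(item):
    await asyncio.sleep(0)
    return f"done:{item}"

results = asyncio.run(run_batch(["a", "b", "c"], fake_worker, max_concurrency=2))
print(results)  # ['done:a', 'done:b', 'done:c']
```

Capacity monitoring and the "what if we run low mid-run" decision would hook into `run_batch` (for example, pausing the semaphore loop), but are left out of this sketch.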
