Best practices for development
Production patterns for The Grid: tier routing, two-balance accounting, retrying retryable errors, streaming, FIFO token consumption, Auto Mode limits, and instrument pinning.
1. Use the right tier per task
2. Monitor consumption balance and credits separately
3. Retry on retryable errors
import random
import time

from openai import OpenAI, APIStatusError

client = OpenAI(base_url="https://api.thegrid.ai/v1", api_key="...")

def call_with_retry(messages, max_retries=5):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="text-prime",
                messages=messages,
            )
        except APIStatusError as e:
            # Retry only the listed status codes, and only while attempts remain.
            if e.status_code in (402, 429, 500, 503) and attempt < max_retries - 1:
                # Exponential backoff (1s, 2s, 4s, ...) plus up to 1s of random jitter.
                wait = (2 ** attempt) + random.uniform(0, 1)
                time.sleep(wait)
                continue
            # Non-retryable status, or the last attempt: surface the error to the caller.
            raise
4. Use streaming where it matters
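A minimal streaming sketch, assuming The Grid's OpenAI-compatible endpoint accepts `stream=True` in the same way the OpenAI Python client does. The helper name `stream_text` is hypothetical. It takes an already-constructed client and yields text fragments as they arrive, so a UI can render output incrementally instead of waiting for the full completion.

```python
def stream_text(client, messages, model="text-prime"):
    """Yield text fragments from a streamed chat completion.

    `client` is assumed to be an OpenAI-compatible client; `stream_text`
    is an illustrative helper, not part of any SDK.
    """
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,  # ask the server to send the response in chunks
    )
    for chunk in stream:
        # Some chunks carry no choices (for example, a trailing usage chunk).
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        # The delta content can be None or empty on role-only or final chunks.
        if piece:
            yield piece
```

One way to use it is `for piece in stream_text(client, messages): print(piece, end="", flush=True)`. Streaming mainly helps interactive, user-facing output. For batch jobs that only need the final text, a non-streamed call is simpler to retry.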
5. Tokens consume oldest-first (FIFO)
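To make oldest-first consumption concrete, here is a small local model of it. This sketch does not call any Grid API: the `TokenLot` type, its field names, and `consume_fifo` are hypothetical, and it only illustrates how a draw is spread across lots when the oldest lot is always used up first.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TokenLot:
    """Hypothetical record of one token purchase, oldest lots queued first."""
    lot_id: str
    remaining: int


def consume_fifo(lots, amount):
    """Draw `amount` tokens from the oldest lots first.

    `lots` is a deque ordered oldest to newest. Returns a list of
    (lot_id, tokens_taken) pairs and removes lots that reach zero.
    """
    if amount > sum(lot.remaining for lot in lots):
        raise ValueError("insufficient tokens across all lots")
    drawn = []
    while amount > 0:
        oldest = lots[0]
        taken = min(oldest.remaining, amount)
        oldest.remaining -= taken
        amount -= taken
        drawn.append((oldest.lot_id, taken))
        if oldest.remaining == 0:
            lots.popleft()  # the oldest lot is exhausted, move to the next one
    return drawn


lots = deque([TokenLot("lot-a", 100), TokenLot("lot-b", 50)])
usage = consume_fifo(lots, 120)  # takes all of lot-a, then 20 from lot-b
```

In this model, a request that needs 120 tokens empties the older lot completely before touching the newer one, which is the behavior to keep in mind when forecasting which purchases will run out first.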
6. Set sensible Auto Mode limits
7. Pin instruments by string
8. Plan for instrument tier changes