# General Agent Skill

If you're building or running a custom agent harness, The Grid plugs in as a drop-in OpenAI-compatible backend. You don't need a tool-specific integration: any harness that issues HTTP calls or uses an OpenAI SDK can target The Grid by changing three values: the base URL, the API key, and the model name.

For the full instrument catalog, see [Current instruments](/docs/instrument-specifications/current-instruments.md). For framework-specific patterns (CrewAI, Deep Agents, DeerFlow, Hermes), see the integration page for each framework.

## Connection details

* **Base URL:** `https://api.thegrid.ai/v1`
* **Auth header:** `Authorization: Bearer <YOUR_CONSUMPTION_KEY>`
* **Format:** OpenAI Chat Completions (drop-in for any OpenAI SDK)
* **Anthropic Messages beta:** `https://messages-beta.api.thegrid.ai/v1` with `x-api-key: <YOUR_CONSUMPTION_KEY>`
* **Model parameter:** an instrument string like `text-prime`, `code-prime`, or `agent-prime`

## Verify the key first

```bash
curl https://api.thegrid.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"text-prime","messages":[{"role":"user","content":"ping"}]}'
```

A `200` confirms the Consumption key works. A `401` means the key is wrong or is a Trading key; inference requires a Consumption key. A `402` means your credits are empty.
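A harness can run the same check at startup and turn the status code into an actionable message. The following is a minimal sketch using only the Python standard library. The function names `diagnose_status` and `verify_key` are hypothetical, and the status meanings are the ones described above.

```python
import json
import urllib.error
import urllib.request

GRID_URL = "https://api.thegrid.ai/v1/chat/completions"


def diagnose_status(status: int) -> str:
    """Map the HTTP status of the ping request to a readable diagnosis."""
    if status == 200:
        return "ok: Consumption key works"
    if status == 401:
        return "unauthorized: wrong key, or a Trading key instead of a Consumption key"
    if status == 402:
        return "payment required: credits are empty"
    return f"unexpected status {status}"


def verify_key(api_key: str, model: str = "text-prime") -> str:
    """Send the same 'ping' request as the curl command and diagnose the result."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": "ping"}]}
    ).encode("utf-8")
    request = urllib.request.Request(
        GRID_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return diagnose_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return diagnose_status(err.code)
```

Calling `verify_key(os.environ["THEGRID_API_KEY"])` before the agent loop starts lets the harness fail fast with a clear message instead of surfacing a raw HTTP error mid-run.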

## Drop-in client setup

{% tabs %}
{% tab title="Python" %}

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_GRID_CONSUMPTION_KEY",
    base_url="https://api.thegrid.ai/v1",
)

response = client.chat.completions.create(
    model="agent-prime",
    messages=[{"role": "user", "content": "Hello from The Grid"}],
)
print(response.choices[0].message.content)
```

{% endtab %}

{% tab title="JavaScript" %}

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.THEGRID_API_KEY,
  baseURL: "https://api.thegrid.ai/v1",
});

const response = await client.chat.completions.create({
  model: "agent-prime",
  messages: [{ role: "user", content: "Hello from The Grid" }],
});
console.log(response.choices[0].message.content);
```

{% endtab %}

{% tab title="curl" %}

```bash
curl https://api.thegrid.ai/v1/chat/completions \
  -H "Authorization: Bearer $THEGRID_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "agent-prime",
    "messages": [{"role": "user", "content": "Hello from The Grid"}]
  }'
```

{% endtab %}
{% endtabs %}

Same SDK, same message format. Change `base_url`, `api_key`, and `model`. Keep everything else.
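One way to keep those three values out of the harness code is to read them from the environment. This is a sketch under stated assumptions: `THEGRID_API_KEY` matches the variable used in the curl examples, while `THEGRID_BASE_URL`, `THEGRID_MODEL`, and the helper `grid_settings` are hypothetical names chosen for illustration.

```python
import os
from typing import Mapping

GRID_BASE_URL = "https://api.thegrid.ai/v1"


def grid_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the three values that differ from a stock OpenAI setup."""
    return {
        # Consumption key; required, so a missing variable raises KeyError.
        "api_key": env["THEGRID_API_KEY"],
        # Hypothetical override variables with defaults taken from this page.
        "base_url": env.get("THEGRID_BASE_URL", GRID_BASE_URL),
        "model": env.get("THEGRID_MODEL", "agent-prime"),
    }


# Usage with the OpenAI SDK (not executed here):
#   settings = grid_settings()
#   client = OpenAI(api_key=settings["api_key"], base_url=settings["base_url"])
#   client.chat.completions.create(model=settings["model"], messages=[...])
```

Keeping the values in one place makes it easy to point the same harness at a different instrument without touching the call sites.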

## Resilient retry pattern

Long-running agents need to handle one edge case: if your harness goes idle and credits drop to zero, the first request after that returns `402`. With Auto-Reload enabled, the platform tops up credits asynchronously, so a retry typically succeeds within a second or two. Wrap your client call in a retry loop with exponential backoff; the example also retries `503` responses:

{% tabs %}
{% tab title="Python" %}

```python
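import time

import openai

client = openai.OpenAI(
    api_key="YOUR_GRID_CONSUMPTION_KEY",
    base_url="https://api.thegrid.ai/v1",
)

def grid_completion(messages, model="agent-prime", max_retries=3):
    """Call The Grid, retrying on 402 (credits reloading) and 503."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except openai.APIStatusError as e:
            # Retry only these statuses, and only while attempts remain.
            if e.status_code in (402, 503) and attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # exponential backoff: 1s, then 2s
                continue
            raise  # any other status, or the final attempt, propagates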
```

{% endtab %}

{% tab title="JavaScript" %}

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.THEGRID_API_KEY,
  baseURL: "https://api.thegrid.ai/v1",
});

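// Call The Grid, retrying on 402 (credits reloading) and 503.
async function gridCompletion(messages, model = "agent-prime", maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.chat.completions.create({ model, messages });
    } catch (e) {
      // Retry only these statuses, and only while attempts remain.
      if ((e.status === 402 || e.status === 503) && attempt < maxRetries - 1) {
        // Exponential backoff: 1s, then 2s.
        await new Promise(r => setTimeout(r, 2 ** attempt * 1000));
        continue;
      }
      throw e; // any other status, or the final attempt, propagates
    }
  }
}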
```

{% endtab %}
{% endtabs %}

This pattern handles cold-start after idle periods. During continuous usage, Auto-Reload keeps credits topped up without interruption.

## Routing across instruments

A single agent run typically benefits from routing tasks across tiers:

* **Planning, deep reasoning, or large context.** `agent-max`, `text-max`, or `code-max`.
* **General execution.** `agent-prime`, `text-prime`, or `code-prime`. Most agent loops belong here.
* **Tool calls, classification, routing, batch edits.** `*-standard`. Fast and cheap for high-volume work.

Pick the instrument per call rather than running everything on one tier. See [Routing patterns](/docs/integrations-and-best-practices/routing-patterns.md) for the full playbook.
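The tiers above can be sketched as a small lookup that a harness consults before each call. The task labels, the `pick_instrument` helper, and the expansion of `*-standard` into names such as `agent-standard` are assumptions made for illustration; confirm the exact instrument IDs against the instrument catalog.

```python
# Hypothetical task labels mapped to the tiers described above.
TIER_BY_TASK = {
    "planning": "max",
    "deep_reasoning": "max",
    "large_context": "max",
    "execution": "prime",
    "tool_call": "standard",
    "classification": "standard",
    "routing": "standard",
    "batch_edit": "standard",
}

FAMILIES = ("agent", "text", "code")


def pick_instrument(task: str, family: str = "agent") -> str:
    """Return an instrument string such as 'agent-max' for a task label."""
    if family not in FAMILIES:
        raise ValueError(f"unknown instrument family: {family!r}")
    # Unlabeled work falls back to the general-execution tier.
    tier = TIER_BY_TASK.get(task, "prime")
    return f"{family}-{tier}"
```

The result is passed as the `model` argument on each request, so a planning step and a classification step in the same run can use different tiers.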

## What works out of the box

Every Grid instrument supports streaming, tool calling, JSON mode, and structured outputs in the OpenAI format. Use `stream=True`, `tools=[...]`, and `response_format=...` exactly as you would against OpenAI directly.
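As an illustration, the request shapes for these features might look like the following. The `get_weather` tool and the helper names are hypothetical. The parameter layout follows the OpenAI Chat Completions format, which this page says The Grid accepts. The `client` argument is an `OpenAI` instance configured as in the client setup section.

```python
# Hypothetical tool definition in the OpenAI function-calling format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def tool_call_request(prompt: str, model: str = "agent-prime") -> dict:
    """Keyword arguments for a request that offers the model one tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
    }


def json_mode_request(prompt: str, model: str = "agent-prime") -> dict:
    """Keyword arguments for a JSON-mode request."""
    return {
        "model": model,
        "messages": [
            # OpenAI's JSON mode expects the prompt to mention JSON;
            # this sketch assumes the same holds here.
            {"role": "system", "content": "Reply with a JSON object."},
            {"role": "user", "content": prompt},
        ],
        "response_format": {"type": "json_object"},
    }


def stream_text(client, prompt: str, model: str = "agent-prime"):
    """Yield text fragments from a streaming completion as they arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices or no text (for example, the final one).
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

A call such as `client.chat.completions.create(**tool_call_request("Weather in Oslo?"))` sends the tool definition; any tool invocations come back on `response.choices[0].message.tool_calls`.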

## Troubleshooting

* **`401 Unauthorized`.** The key is wrong, or it's a Trading key (Consumption keys authenticate inference). Generate a Consumption key from your dashboard.
* **`402 Payment Required`.** Credits are empty. Top up or enable Auto-Reload at [app.thegrid.ai](https://app.thegrid.ai). Auto-Reload pulls funds from your saved payment method when credits drop below the threshold you configure.
* **`404 Not Found` on a model.** The model string didn't match a current instrument. Use one of the IDs from [Current instruments](/docs/instrument-specifications/current-instruments.md).
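* **`5xx` from a supplier.** The matching engine occasionally hits a transient supplier failure. Retry with the same backoff pattern as `402`. The wrapper under "Resilient retry pattern" retries only `503`, so widen its status check if you want to cover other `5xx` codes.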
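The retry guidance in this list can be condensed into a predicate and a delay schedule. This is a sketch, not platform-defined behavior: `is_retryable` and `backoff_delays` are hypothetical helpers. Retrying `402` is only useful when Auto-Reload is enabled, and `401` and `404` are left out because a retry cannot fix a bad key or an unknown model string.

```python
def is_retryable(status: int) -> bool:
    """True for statuses worth retrying: 402 (credits reloading) and any 5xx."""
    return status == 402 or 500 <= status <= 599


def backoff_delays(max_retries: int = 3, base: float = 1.0) -> list[float]:
    """Seconds to sleep between attempts, doubling each time.

    There is no delay after the final attempt, so the list has
    max_retries - 1 entries.
    """
    return [base * 2 ** attempt for attempt in range(max_retries - 1)]
```

With the default of three attempts, the schedule is a 1-second wait followed by a 2-second wait, matching the `2 ** attempt` backoff used in the retry wrapper.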


