# LiteLLM

LiteLLM normalizes 100+ LLM providers to the OpenAI format. The Grid plugs in as an OpenAI-compatible backend. You can call The Grid directly from the `litellm` Python SDK, or run the LiteLLM proxy in front of The Grid so any downstream OpenAI-compatible tool routes through a single endpoint.

* **Quick version.** Prefix Grid models with `openai/` so LiteLLM uses its OpenAI-compatible client. Set `api_base: https://api.thegrid.ai/v1` and `api_key: os.environ/THEGRID_API_KEY`. For the proxy, run `uv run litellm --config config.yaml`.

For the full instrument catalog, see [Current instruments](/docs/instrument-specifications/current-instruments.md).

## Two patterns

* **SDK.** Your Python app calls `litellm.completion(...)` with The Grid as a backend. LiteLLM handles retries, fallbacks, and response normalization.
* **Proxy.** Run `uv run litellm --config config.yaml` as a standalone server. All your agents, IDEs, and scripts point at the proxy. The proxy fans out to The Grid (and optionally OpenAI, Anthropic, local models) from one `model_list`.

In both patterns, The Grid is the upstream backend. The `openai/` prefix on `model` selects LiteLLM's OpenAI-compatible client; `api_base` does the rest.

## Prerequisites

* Python 3.8+ with LiteLLM installed: `uv add "litellm[proxy]"` (the `[proxy]` extra covers both the SDK and the proxy).
* A Grid account at [app.thegrid.ai](https://app.thegrid.ai) with funded credits.
* A consumption API key from your dashboard, exported as `THEGRID_API_KEY`.

## Pattern 1: LiteLLM SDK

{% tabs %}
{% tab title="Python" %}

```python
import os
import litellm

response = litellm.completion(
    model="openai/agent-prime",
    api_base="https://api.thegrid.ai/v1",
    api_key=os.environ["THEGRID_API_KEY"],
    messages=[{"role": "user", "content": "Plan a three-step research task."}],
)

print(response.choices[0].message.content)
```

{% endtab %}
{% endtabs %}

The `openai/` prefix is required. Streaming, tool calling, and structured outputs work the same way as any other OpenAI-compatible call (`stream=True`, `tools=[...]`, `response_format=...`).
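
For the streaming case, a minimal sketch might look like the following. It assumes chunks follow the OpenAI streaming shape (`chunk.choices[0].delta.content`); `collect_text` and `stream_from_grid` are hypothetical helper names, not part of LiteLLM.

```python
import os


def collect_text(chunks):
    """Join the text deltas from an OpenAI-style stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk usually carries no content
            parts.append(delta)
    return "".join(parts)


def stream_from_grid(prompt, model="openai/agent-prime"):
    """Hypothetical wrapper: stream one completion from The Grid."""
    import litellm  # imported here so collect_text stays usable without it

    response = litellm.completion(
        model=model,
        api_base="https://api.thegrid.ai/v1",
        api_key=os.environ["THEGRID_API_KEY"],
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # returns an iterator of chunks instead of one response
    )
    return collect_text(response)
```

In an interactive tool you would typically print each delta as it arrives rather than joining them at the end.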

## Pattern 2: LiteLLM Proxy

Run LiteLLM as a gateway. Any OpenAI-compatible client points at `http://localhost:4000`.

### 1. Install and set keys

```bash
uv add "litellm[proxy]"
export THEGRID_API_KEY=<your-grid-consumption-key>
export LITELLM_MASTER_KEY=sk-1234   # must start with sk-; downstream clients send this as the Bearer token
```

### 2. Write `config.yaml`

```yaml
model_list:
  - model_name: grid/text-prime
    litellm_params:
      model: openai/text-prime
      api_base: https://api.thegrid.ai/v1
      api_key: os.environ/THEGRID_API_KEY
  - model_name: grid/code-prime
    litellm_params:
      model: openai/code-prime
      api_base: https://api.thegrid.ai/v1
      api_key: os.environ/THEGRID_API_KEY
  - model_name: grid/agent-prime
    litellm_params:
      model: openai/agent-prime
      api_base: https://api.thegrid.ai/v1
      api_key: os.environ/THEGRID_API_KEY
  # Add the other six instruments the same way; see Current instruments.

litellm_settings:
  num_retries: 3
  request_timeout: 300
  drop_params: True

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
```

Field-name gotchas: use `api_base` (not `base_url`), and `os.environ/NAME` (not `${NAME}`) for env-var references.
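
These gotchas are easy to check mechanically. The sketch below assumes you have already loaded `config.yaml` into a Python dict (for example with a YAML parser of your choice) and inspects one `model_list` entry at a time. `lint_model_entry` is a hypothetical helper, not a LiteLLM function.

```python
GRID_API_BASE = "https://api.thegrid.ai/v1"


def lint_model_entry(entry):
    """Return a list of human-readable problems for one model_list entry."""
    problems = []
    params = entry.get("litellm_params", {})

    # LiteLLM expects api_base; base_url is the common mistake.
    if "base_url" in params:
        problems.append("use api_base, not base_url")

    # Env-var references use os.environ/NAME, not shell-style ${NAME}.
    if "${" in str(params.get("api_key", "")):
        problems.append("use os.environ/NAME, not ${NAME}")

    # Entries that target The Grid need the openai/ prefix on model.
    model = str(params.get("model", ""))
    if params.get("api_base") == GRID_API_BASE and not model.startswith("openai/"):
        problems.append("Grid models need the openai/ prefix")

    return problems


entry = {
    "model_name": "grid/text-prime",
    "litellm_params": {
        "model": "openai/text-prime",
        "base_url": GRID_API_BASE,          # wrong field name
        "api_key": "${THEGRID_API_KEY}",    # wrong env-var syntax
    },
}
for problem in lint_model_entry(entry):
    print(problem)
```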

### 3. Start the proxy

```bash
uv run litellm --config config.yaml
# proxy runs on http://0.0.0.0:4000 by default
```

### 4. Verify with curl

```bash
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "grid/agent-prime", "messages": [{"role": "user", "content": "hello"}]}'
```

A 200 means LiteLLM accepted your master key, resolved `grid/agent-prime` to `openai/agent-prime` at `api.thegrid.ai/v1`, and returned a completion.
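
If you would rather run the same check from Python, the following standard-library sketch mirrors the curl call. It assumes the proxy is listening on `http://localhost:4000` and that `LITELLM_MASTER_KEY` is exported; `build_chat_request` and `verify_proxy` are hypothetical helper names.

```python
import json
import os
import urllib.request


def build_chat_request(base_url, api_key, model, prompt):
    """Build (but do not send) a chat-completions POST for the proxy."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def verify_proxy(base_url="http://localhost:4000"):
    """Send one request through the proxy and return the reply text."""
    request = build_chat_request(
        base_url, os.environ["LITELLM_MASTER_KEY"], "grid/agent-prime", "hello"
    )
    # urlopen raises urllib.error.HTTPError on non-2xx responses (401, 402, ...).
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `verify_proxy()` with the proxy running should return the completion text; an `HTTPError` carries the status code to look up under Troubleshooting.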

## Point downstream tools at the proxy

Once the proxy is up, any OpenAI-compatible tool (Cline, Continue, your Python scripts) can use it as the inference backend:

* **Base URL:** `http://localhost:4000` (or `http://localhost:4000/v1` depending on the client)
* **API key:** your `LITELLM_MASTER_KEY`
* **Model name:** any `model_name` from your `model_list` (e.g., `grid/code-prime`)

Add or swap a backend once in `config.yaml` and every downstream tool picks up the change on proxy restart.
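
As one example of a downstream client, here is a sketch using the official `openai` Python package, which is a separate install and an assumption of this example. `proxy_settings` and `ask_via_proxy` are hypothetical helpers; the `with_v1` flag covers clients that expect the `/v1` suffix in the base URL.

```python
import os


def proxy_settings(base_url="http://localhost:4000", with_v1=True):
    """Hypothetical helper: connection settings for a client behind the proxy."""
    root = base_url.rstrip("/")
    if with_v1 and not root.endswith("/v1"):
        root += "/v1"
    return {"base_url": root, "api_key": os.environ.get("LITELLM_MASTER_KEY", "")}


def ask_via_proxy(prompt, model="grid/code-prime"):
    """Send one chat request through the proxy with the OpenAI client."""
    from openai import OpenAI  # requires the separate `openai` package

    client = OpenAI(**proxy_settings())
    response = client.chat.completions.create(
        model=model,  # must match a model_name in the proxy's model_list
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

The client never sees `THEGRID_API_KEY`; it only holds the proxy's master key, and the proxy attaches the Grid key upstream.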

## Stable aliases for multi-provider setups

If you want stable route names regardless of which backend fulfills them, alias them in `model_list`:

```yaml
model_list:
  - model_name: default
    litellm_params:
      model: openai/text-prime
      api_base: https://api.thegrid.ai/v1
      api_key: os.environ/THEGRID_API_KEY
  - model_name: openai-fallback
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY

router_settings:
  fallbacks:
    - default: ["openai-fallback"]
```

Your app calls `model="default"` and never touches a vendor name. Swap the backend in one place.
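
To make the fallback mapping concrete, here is a conceptual sketch of what "try `default`, then `openai-fallback`" means. It is an illustration only, not LiteLLM's router implementation. `call_with_fallbacks` and `fake_backend` are hypothetical, and the fallbacks are passed as a plain dict rather than the list-of-mappings form used in the YAML.

```python
def call_with_fallbacks(route, fallbacks, call):
    """Try `route`, then each fallback route in order; return the first success."""
    candidates = [route] + fallbacks.get(route, [])
    last_error = None
    for name in candidates:
        try:
            return name, call(name)
        except Exception as error:  # a real router is more selective about errors
            last_error = error
    raise last_error


def fake_backend(name):
    """Stand-in for a real completion call: the primary route is down."""
    if name == "default":
        raise RuntimeError("upstream unavailable")
    return f"answer from {name}"


served_by, answer = call_with_fallbacks(
    "default", {"default": ["openai-fallback"]}, fake_backend
)
print(served_by, "->", answer)
```

The caller only ever names `default`; which backend actually answered is a routing detail.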

## Troubleshooting

* **401 Unauthorized.** Verify the key is a consumption key (not a trading key), `THEGRID_API_KEY` is exported in the shell that started LiteLLM, and `api_base` is exactly `https://api.thegrid.ai/v1` with no trailing slash. Test the key with `curl https://api.thegrid.ai/v1/chat/completions -H "Authorization: Bearer $THEGRID_API_KEY" -d '...'`.
* **402 Payment Required.** Your credit balance is empty. Top up at [app.thegrid.ai](https://app.thegrid.ai). LiteLLM retries up to `num_retries` times and then falls back if a fallback is configured, but the 402 persists until you add funds.
* **"model not found" downstream.** The `model` your client sends must match a `model_name` in `model_list` exactly. `grid/code-prime` is not the same as `grid-code-prime`.
* **Cost tracking is approximate.** LiteLLM tracks cost from a static pricing table. The Grid is a market with live pricing; LiteLLM's numbers are a rough guide. Source of truth is your spend in the [app dashboard](https://app.thegrid.ai).
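
For the "model not found" case, a small standard-library sketch can point out near-miss names. It assumes you have the list of `model_name` values from your `model_list` at hand; `suggest_model_name` is a hypothetical helper.

```python
import difflib


def suggest_model_name(requested, configured):
    """Return the exact name if configured, else the closest match, else None."""
    if requested in configured:
        return requested
    matches = difflib.get_close_matches(requested, configured, n=1, cutoff=0.6)
    return matches[0] if matches else None


configured = ["grid/text-prime", "grid/code-prime", "grid/agent-prime"]
print(suggest_model_name("grid-code-prime", configured))
```

The proxy itself still requires an exact match; this only helps you spot the typo.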

