FAQs

The Grid is the world's first spot market for AI inference. These FAQs will help you quickly understand how to use it, how it differs from the status quo of vendor-based, subscription-style inference consumption, and the advantages it provides.

What is The Grid?

The Grid is a spot market for AI inference. You choose a standardized quality tier (Text Prime or Text Standard) defined by measurable performance specifications, send requests through one OpenAI-compatible API, and suppliers compete to serve them on a live order book. The price you pay and the quality you receive are set by real supply and demand in the market.

How is The Grid different from OpenRouter?

OpenRouter routes your request to a provider at their listed price. The Grid runs a live market with active price discovery, where suppliers compete to provide the best service at their lowest cost.

The incentive is different. Instead of your query getting routed to a particular provider, suppliers have to compete to serve you. That pushes them toward better operations and more efficient model serving. It also benefits suppliers, since they get to sell overcapacity and grow their revenue through a transparent marketplace. The result is better pricing for you, backed by enforced quality standards.

Why wouldn't I just keep using OpenAI or Anthropic directly?

You can. But you're taking on more than just the API cost. You have to track which model is the latest, re-evaluate on every release, manage your vendor relationship, handle outages yourself, and rebuild whenever they deprecate something.

With The Grid, you pick a quality tier (Text Prime or Text Standard) and the market handles the rest. Better models get included in your tier automatically, with no migration and no eval sprint. Our periodic evaluation process ensures that suppliers continue serving updated models that comply with the tier specification. If a supplier goes down, your request routes to the next qualifying one. That's time and effort your team gets back, on top of paying less because of supplier competition.

Do I need to change my code to try this?

If you're calling any OpenAI or Chat Completions compatible API today, you only need to change two things: the base URL and your API key. Message format, streaming, tool calling, and structured output all stay the same. The same applies if you're on LangChain, LlamaIndex, LiteLLM, or raw HTTP.
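As a minimal sketch of what that switch looks like with raw HTTP from Python's standard library — note that the base URL and the tier name used as the model id below are illustrative assumptions, not documented values:

```python
# Sketch: pointing an existing Chat Completions integration at The Grid.
# Only the endpoint and credentials change; the request body is a
# standard Chat Completions payload.
import json
import urllib.request

GRID_BASE_URL = "https://api.thegrid.example/v1"  # assumption: use your real base URL
GRID_API_KEY = "grid-sk-..."                      # your Grid API key

payload = {
    "model": "text-prime",  # assumption: tier name used as the model id
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
}

request = urllib.request.Request(
    GRID_BASE_URL + "/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {GRID_API_KEY}",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(request) would send it; everything else in your
# integration (message format, streaming, tool calls) stays the same.
```

If you use an SDK instead, the same two values are typically passed as the client's base URL and API key at construction time.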

How is the price I pay determined?

Suppliers compete on a live order book, and prices are set by real-time supply and demand. You get a market-driven price rather than being locked into a single vendor's opaque list price. When capacity is plentiful, prices drop. When demand rises, they adjust accordingly.
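To make the mechanism concrete, here is a toy model of "cheapest qualifying offer wins" on an order book. This is an illustrative sketch, not The Grid's actual matching engine; all supplier names and prices are made up:

```python
# Toy model of lowest-ask matching on a spot order book: the request is
# routed to the cheapest supplier whose offer meets the tier spec.
from dataclasses import dataclass

@dataclass
class Ask:
    supplier: str
    price_per_mtok: float  # asking price per million tokens (illustrative)
    meets_spec: bool       # passed the tier's benchmark thresholds

def match(asks: list[Ask]) -> Ask:
    """Return the cheapest ask that qualifies under the spec."""
    qualifying = [a for a in asks if a.meets_spec]
    return min(qualifying, key=lambda a: a.price_per_mtok)

book = [
    Ask("supplier-a", 0.90, True),
    Ask("supplier-b", 0.75, True),
    Ask("supplier-c", 0.40, False),  # cheapest ask, but fails the spec
]
winner = match(book)  # supplier-b: cheapest *qualifying* offer
```

The point of the sketch: undercutting on price only wins the trade if the quality bar is also met, which is what keeps supplier competition from becoming a race to the bottom.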

How much would I actually save?

If your workload needs output at a particular intelligence level, you may save meaningfully because suppliers compete to win your business by serving the most cost-efficient yet performant model that meets the tier's criteria. That competition creates true price discovery, which is why market pricing tends to run below any single vendor's retail rate.

There's also the cost that never appears on an API invoice: model evaluations every release cycle, vendor management, quality monitoring you built in-house, migration work when providers change terms. With The Grid, most of that goes away.

When should I use Text Prime vs. Text Standard?

Text Prime is for tasks that need deeper reasoning: agentic workflows, code generation, RAG synthesis, analysis, long-form writing. If the quality of the model's thinking directly affects your product, use Text Prime.

Text Standard is for tasks that need fast, reliable responses: classification, summarization, data extraction, chat, batch processing. If speed and throughput matter more than reasoning depth, use Text Standard.

You pick the tier, and the market matches you to the best-priced supplier that meets the specification benchmarks.

How do I know the output quality won't drop?

Every tier is defined by a Specification with hard thresholds for intelligence, latency, throughput, context window, and uptime. We actively monitor and test all supplier models against these benchmarks on an ongoing basis. If a supplier drifts below the spec, they're corrected or removed. You're not billed for responses that don't meet the specification.
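A tier specification of this shape can be thought of as a set of hard thresholds that every supplier measurement must clear. The sketch below is hypothetical; the threshold values and field names are invented for illustration and are not The Grid's published spec:

```python
# Hypothetical tier spec: hard thresholds a supplier's measured
# performance must clear to keep serving the tier.
SPEC = {
    "min_intelligence_score": 60.0,   # invented benchmark threshold
    "max_p50_latency_s": 1.5,
    "min_throughput_tok_s": 80.0,
    "min_context_window": 128_000,
    "min_uptime_pct": 99.5,
}

def meets_spec(measured: dict) -> bool:
    """True only if every measured value clears its threshold."""
    return (
        measured["intelligence_score"] >= SPEC["min_intelligence_score"]
        and measured["p50_latency_s"] <= SPEC["max_p50_latency_s"]
        and measured["throughput_tok_s"] >= SPEC["min_throughput_tok_s"]
        and measured["context_window"] >= SPEC["min_context_window"]
        and measured["uptime_pct"] >= SPEC["min_uptime_pct"]
    )

# A supplier that is fast and reliable but has drifted below the
# intelligence threshold fails the check and would be flagged.
drifting = {"intelligence_score": 58.0, "p50_latency_s": 1.2,
            "throughput_tok_s": 95.0, "context_window": 128_000,
            "uptime_pct": 99.9}
ok = meets_spec(drifting)  # False: one failed threshold fails the spec
```

Because the check is a conjunction, drifting on any single dimension is enough to drop a supplier out of the tier until it is corrected.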

Do you support zero data retention?

We work with all suppliers on our platform to validate their support for zero data retention. We maintain a provider data directory with up-to-date information on each supplier's data retention policies and practices. Please reach out to us at support@thegrid.ai for more information.
