Introduction
AnyGpu provides unified access to 200+ AI models from multiple providers through a single, standardized API. Built for developers who want flexibility without vendor lock-in.
📦 Unified API
One API to access models from OpenAI, Anthropic, Google, Meta, DeepSeek, and more.
⚡ Low Latency
Optimized routing ensures minimal latency for your production workloads.
💰 Cost Effective
Competitive pricing with token-level cost tracking and spending limits.
Quick Start
Get up and running with AnyGpu in under 5 minutes.
Create an Account
Sign up for free and receive your API key instantly.
Get Your API Key
Navigate to the Console to copy your API key.
Make Your First Request
curl https://api.anygpu.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
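The same request can be sketched in Python using only the standard library. The URL, headers, and body are taken from the curl command above; the function names `build_first_request` and `send` are illustrative, not part of any AnyGpu SDK.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.anygpu.com/v1/chat/completions"


def build_first_request(api_key: str) -> urllib.request.Request:
    """Build (without sending) the POST request shown in the curl example."""
    body = {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello!"}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> str:
    """Send the request and return the raw response text (needs a valid key)."""
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Calling `send(build_first_request("YOUR_API_KEY"))` performs the actual network call; building the request separately makes it easy to inspect before sending.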
Authentication
AnyGpu uses API key authentication. Include your API key in the Authorization header of every request.
Authorization: Bearer sk-anygpu-xxxxxxxxxxxx
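A common way to avoid hard-coding the key is to read it from the environment. The sketch below assumes an environment variable named `ANYGPU_API_KEY`; that name is a convention chosen for this example, not something AnyGpu defines.

```python
import os


def auth_headers(env_var: str = "ANYGPU_API_KEY") -> dict[str, str]:
    """Return the Authorization header built from an environment variable.

    The variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your AnyGpu API key")
    return {"Authorization": f"Bearer {key}"}
```

Failing early when the key is missing gives a clearer error than a 401 response from the API.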
Models
AnyGpu supports 200+ models across all major providers. Browse the full catalog on the Models page.
| Model | Provider | Context | Best For |
|---|---|---|---|
| GPT-4o | OpenAI | 128K | General purpose |
| Claude 3.5 Sonnet | Anthropic | 200K | Long context tasks |
| Gemini 2.0 Flash | Google | 1M | High volume, low cost |
| Llama 4 | Meta | 128K | Open source projects |
| DeepSeek V3 | DeepSeek | 64K | Coding tasks |
Chat Completion
The Chat Completion API follows the OpenAI chat format, so clients already written against that format can switch to AnyGpu with minimal changes.
{
"model": "gpt-4o",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is AnyGpu?"}
],
"max_tokens": 512,
"temperature": 0.7
}
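The request body above can be assembled programmatically. This is a minimal sketch: `build_chat_body` is a hypothetical helper, and it only covers the fields shown in the example (`model`, `messages`, `max_tokens`, `temperature`).

```python
import json
from typing import Optional


def build_chat_body(
    model: str,
    user_content: str,
    system_content: Optional[str] = None,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
) -> str:
    """Serialize a chat completion request body like the JSON example above."""
    messages = []
    if system_content is not None:
        # The system message goes first, as in the example body.
        messages.append({"role": "system", "content": system_content})
    messages.append({"role": "user", "content": user_content})

    body = {"model": model, "messages": messages}
    # Optional fields are omitted entirely when not set.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if temperature is not None:
        body["temperature"] = temperature
    return json.dumps(body)
```

For instance, `build_chat_body("gpt-4o", "What is AnyGpu?", "You are a helpful assistant.", 512, 0.7)` produces the same fields as the example body.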
Streaming
Enable streaming for real-time responses in chat interfaces.
curl https://api.anygpu.com/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}], "stream": true}'
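The stream format is not specified on this page. Assuming it mirrors OpenAI-style server-sent events (lines of the form `data: {json}`, text in `choices[].delta.content`, and a final `data: [DONE]`), a client could extract the text roughly as follows. Treat the chunk shape as an assumption to verify against real responses.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The function accepts any iterable of decoded lines, so it can be fed from an HTTP response body or from a recorded transcript.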
Endpoints
| Endpoint | Description |
|---|---|
| /v1/chat/completions | Chat completion |
| /v1/models | List available models |
| /v1/models/{id} | Get model info |
| /v1/me | Current user info |
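The endpoint paths can be turned into full URLs with a small helper. The sketch assumes every endpoint lives under the same host as the chat completions example (`https://api.anygpu.com`) and that model ids should be percent-encoded in the path; neither is confirmed by this page.

```python
from urllib.parse import quote

# Host taken from the curl examples; assumed to serve all endpoints.
BASE_URL = "https://api.anygpu.com"


def endpoint_url(path: str) -> str:
    """Join the base URL with a fixed endpoint path such as /v1/models."""
    return BASE_URL + "/" + path.lstrip("/")


def model_url(model_id: str) -> str:
    """Build the /v1/models/{id} URL, percent-encoding the id (an assumption)."""
    return endpoint_url("/v1/models/" + quote(model_id, safe=""))
```

If model ids are expected to contain raw slashes in the path, drop the `quote` call or pass `safe="/"`.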
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier |
| messages | array | Yes | Array of message objects |
| max_tokens | integer | No | Max tokens to generate |
| temperature | float | No | Sampling temperature (0–2) |
| stream | boolean | No | Enable streaming |
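The table above translates directly into a client-side check that catches malformed requests before they are sent. This is an illustrative sketch; `validate_chat_params` is a hypothetical helper and the server may apply additional rules.

```python
def validate_chat_params(params: dict) -> list[str]:
    """Return a list of problems found in params; empty means it looks valid."""
    errors = []

    # Required fields.
    if not isinstance(params.get("model"), str) or not params["model"]:
        errors.append("model: required, must be a non-empty string")
    if not isinstance(params.get("messages"), list):
        errors.append("messages: required, must be an array of message objects")

    # Optional fields. bool is excluded from the numeric checks because
    # Python treats True/False as integers.
    if "max_tokens" in params:
        value = params["max_tokens"]
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append("max_tokens: must be an integer")
    if "temperature" in params:
        value = params["temperature"]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append("temperature: must be a number")
        elif not 0 <= value <= 2:
            errors.append("temperature: must be between 0 and 2")
    if "stream" in params and not isinstance(params["stream"], bool):
        errors.append("stream: must be a boolean")

    return errors
```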
Error Codes
| Code | Message | Description |
|---|---|---|
| 401 | Invalid API key | Check your API key format and validity |
| 429 | Rate limit exceeded | Slow down requests or upgrade plan |
| 500 | Internal server error | Retry with backoff |
| 503 | Model unavailable | Try a different model or wait |
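One way to act on these codes is to retry 429, 500, and 503 with increasing delays and return everything else immediately. The sketch below uses exponential backoff, which is a choice made for this example; the table only says "backoff". The `send` callable is a placeholder for whatever function performs the request and returns `(status, body)`.

```python
import time
from typing import Callable, Tuple

# Codes the table suggests waiting on; 401 is not retried because
# repeating the call with the same invalid key cannot succeed.
RETRYABLE = {429, 500, 503}


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE or last_attempt:
            return status, body
        # Delay doubles each time: base, 2*base, 4*base, ...
        sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. For 503, switching to a different model is the other option the table mentions and is not covered here.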
Pricing
Pay per token. No subscriptions, no commitments. View real-time pricing on the Models page.
Free Tier
- 1,000 tokens/day
- 5 models
- Community support
Pro
- Unlimited tokens
- All 200+ models
- Priority support
- Usage analytics
Enterprise
- Dedicated capacity
- SLA guarantee
- Custom models
- Dedicated support
Billing
Billing is calculated per token with granular cost tracking per model. Set spending limits to control costs.
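To illustrate how per-token billing and a spending limit interact, here is a client-side sketch. It does not reflect how AnyGpu computes bills or enforces limits; the class names, the per-million-token pricing unit, and any prices passed in are assumptions for the example.

```python
from collections import defaultdict
from decimal import Decimal


class SpendingLimitExceeded(Exception):
    """Raised when recording usage would push total cost past the limit."""


class SpendTracker:
    """Track per-model token cost against a limit (illustrative only)."""

    def __init__(self, limit_usd: Decimal, price_per_million: dict[str, Decimal]):
        self.limit = limit_usd
        self.prices = price_per_million  # hypothetical USD per 1M tokens
        self.by_model: dict[str, Decimal] = defaultdict(Decimal)

    @property
    def total(self) -> Decimal:
        return sum(self.by_model.values(), Decimal(0))

    def record(self, model: str, tokens: int) -> Decimal:
        """Add the cost of `tokens` for `model`; refuse if over the limit."""
        cost = Decimal(tokens) * self.prices[model] / Decimal(1_000_000)
        if self.total + cost > self.limit:
            raise SpendingLimitExceeded(f"{model}: {cost} would exceed {self.limit}")
        self.by_model[model] += cost
        return cost
```

`Decimal` is used instead of `float` so that small per-token amounts add up without rounding drift.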