openrouter/auto) automatically selects the best model for your prompt, powered by NotDiamond.
Overview
Instead of manually choosing a model, let the Auto Router analyze your prompt and select the optimal model from a curated set of high-quality options. The router considers factors like prompt complexity, task type, and model capabilities.Usage
Set your model toopenrouter/auto:
Response
The response includes themodel field showing which model was actually used:
How It Works
- Prompt Analysis: Your prompt is analyzed by NotDiamond’s routing system
- Model Selection: The optimal model is selected based on the task requirements
- Request Forwarding: Your request is forwarded to the selected model
- Response Tracking: The response includes metadata showing which model was used
Session Stickiness
The Auto Router pins both the selected model and provider so that subsequent requests in the same conversation route to the same place. This ensures consistent behavior within a conversation and maximizes prompt cache hits. Stickiness applies at two levels:- Implicit (automatic): OpenRouter derives a conversation fingerprint from your messages (hashing the first system message and first user message). Once the provider reports prompt cache usage, the model and provider are pinned for that conversation. No configuration needed.
- Explicit (
session_id): When you include asession_id, stickiness kicks in on the first successful response — even before cache usage is observed. This is recommended for multi-turn conversations and agent workflows where you want consistent routing from the start.
x-session-id header, see Provider Sticky Routing.
Example with session_id
Why It Matters for the Auto Router
Unlike using a fixed model, the Auto Router selects a different model each time based on your prompt. Session stickiness is especially important here because it also pins the model selection — not just the provider. Without it, you could get different models on each turn of a conversation, leading to inconsistent behavior and wasted prompt cache.Supported Models
The Auto Router selects from a curated set of high-quality models including:- Claude Sonnet 4.5 (
anthropic/claude-sonnet-4.5) - Claude Opus 4.5 (
anthropic/claude-opus-4.5) - GPT-5.1 (
openai/gpt-5.1) - Gemini 3.1 Pro (
google/gemini-3.1-pro-preview) - DeepSeek 3.2 (
deepseek/deepseek-v3.2) - And other top-performing models
Configuring Allowed Models
You can restrict which models the Auto Router can select from using theplugins parameter. This is useful when you want to limit routing to specific providers or model families.
Via API Request
Use wildcard patterns to filter models. For example,anthropic/* matches all Anthropic models:
Via Settings UI
You can also configure default allowed models in your Plugin Settings:- Navigate to Settings > Plugins
- Find Auto Router and click the configure button
- Enter model patterns (one per line)
- Save your settings
Pattern Syntax
| Pattern | Matches |
|---|---|
anthropic/* | All Anthropic models |
openai/gpt-5* | All GPT-5 variants |
google/* | All Google models |
openai/gpt-5.1 | Exact match only |
*/claude-* | Any provider with claude in model name |
Cost / Quality Tradeoff
Control how aggressively the Auto Router optimizes for cost vs. quality using thecost_quality_tradeoff parameter (integer, 0–10):
- 0 = pure quality — always picks the most capable model regardless of cost
- 10 = maximize for cost — cheapest model wins
- Intermediate values blend quality and cost signals continuously
Via API Request
Via Settings UI
You can also set a default tradeoff in your Plugin Settings under Auto Router. The per-request value overrides this default.Pricing
You pay the standard rate for whichever model is selected. There is no additional fee for using the Auto Router.Use Cases
- General-purpose applications: When you don’t know what types of prompts users will send
- Cost optimization: Let the router choose efficient models for simpler tasks
- Quality optimization: Ensure complex prompts get routed to capable models
- Experimentation: Discover which models work best for your use case
Limitations
- The router requires
messagesformat (notprompt) - Streaming is supported
- All standard OpenRouter features (tool calling, etc.) work with the selected model
Related
- Body Builder - Generate multiple parallel API requests
- Latest Model Resolution - Always target the newest version of a model family
- Model Fallbacks - Configure fallback models
- Provider Selection - Control which providers are used