Service Tiers
The service_tier parameter lets you control cost and latency tradeoffs when sending requests through OpenRouter. You can pass it in your request to select a specific processing tier, and the response will indicate which tier was actually used.
Not every model from a provider supports service tiers. Additionally, your requested service tier is not guaranteed to be honored — the provider may serve your request on a different tier depending on availability. The service_tier field in the response indicates which tier was actually used, and you will be billed according to that actual tier.
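As a sketch of the flow described above, the snippet below builds a Chat Completions request body with a requested service_tier and then reads the tier actually used from the response. The model name is a placeholder and the response object is a made-up, abridged sample; the field names follow this document.

```python
import json

# Request body for POST /api/v1/chat/completions on OpenRouter.
# "openai/gpt-4o" is a placeholder model; tier values are provider-specific.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "service_tier": "flex",  # requested tier; not guaranteed to be honored
}

# Abridged sample response: the provider served this request on a
# different tier than requested (e.g. flex capacity was unavailable).
response = json.loads(
    '{"id": "gen-123", "service_tier": "default", "usage": {"total_tokens": 42}}'
)

# Billing follows the tier actually used, not the one requested.
actual_tier = response.get("service_tier")
print(actual_tier)  # default
```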
Supported Providers
OpenAI
- Accepted request values: auto, default, flex, priority (default if omitted: auto)
- Possible response values: default, flex, priority
Learn more in OpenAI’s Chat Completions and Responses API documentation. See OpenAI’s pricing page for details on cost differences between tiers.
Google (Vertex AI)
- Accepted request values: standard, flex, priority (default if omitted: standard)
- Possible response values: standard, flex, priority
Learn more in Google’s Flex and Priority documentation.
API Response Differences
The API response includes a service_tier field that indicates which capacity tier was actually used to serve your request. The placement of this field varies by API format:
- Chat Completions API (/api/v1/chat/completions): service_tier is returned at the top level of the response object, matching OpenAI’s native format.
- Responses API (/api/v1/responses): service_tier is returned at the top level of the response object, matching OpenAI’s native format.
- Messages API (/api/v1/messages): service_tier is returned inside the usage object, matching Anthropic’s native format.