Create a chat completion - OpenRouter

Authorizations

Authorization

string

header

required

API key as bearer token in Authorization header
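As a minimal sketch of how the bearer token is attached, the snippet below builds (but does not send) a request using only the Python standard library. The endpoint URL, the helper name build_request, and the placeholder key are assumptions for illustration and do not come from this page; the optional metadata argument maps to the X-OpenRouter-Metadata header documented under Headers.

```python
import json
import urllib.request

# Assumed endpoint URL; this page does not state it.
API_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(api_key: str, body: dict, metadata: str = "disabled") -> urllib.request.Request:
    """Build a POST request carrying the API key as a bearer token (hypothetical helper)."""
    if metadata not in ("disabled", "enabled"):
        raise ValueError("X-OpenRouter-Metadata must be 'disabled' or 'enabled'")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "X-OpenRouter-Metadata": metadata,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_request(
    "sk-example-key",  # placeholder, not a real key
    {"model": "openai/gpt-4", "messages": [{"role": "user", "content": "Hello!"}]},
    metadata="enabled",
)
```

Passing req to urllib.request.urlopen would send it; the sketch stops short of that so it can be run without network access or credentials.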

Headers

X-OpenRouter-Metadata

enum<string>

Opt-in level for surfacing routing metadata on the response under openrouter_metadata. Defaults to disabled. The legacy header X-OpenRouter-Experimental-Metadata is also accepted for backward compatibility.

Available options:

disabled,

enabled

Example:

"enabled"

Body

application/json

Chat completion request parameters

messages

object[]

required

List of messages for the conversation

Minimum array length: 1

Chat completion message with role-based discrimination

Example:

{
  "content": "What is the capital of France?",
  "name": "Assistant Config",
  "role": "user"
}

Example:

[{ "content": "Hello!", "role": "user" }]

cache_control

object

Enable automatic prompt caching. When set at the top level, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models.

Example:

{ "type": "ephemeral" }

debug

object

Debug options for inspecting request transformations (streaming only)

Example:

{ "echo_upstream_body": true }

frequency_penalty

number<double> | null

Frequency penalty (-2.0 to 2.0)

Example:

0

image_config

object

Provider-specific image configuration options. Keys and values vary by model/provider. See https://openrouter.ai/docs/guides/overview/multimodal/image-generation for more details.

Example:

{ "aspect_ratio": "16:9", "quality": "high" }

logit_bias

object | null

Token logit bias adjustments

Example:

{ "50256": -100 }

logprobs

boolean | null

Return log probabilities

Example:

false

max_completion_tokens

integer | null

Maximum tokens in completion

Example:

100

max_tokens

integer | null

Maximum tokens (deprecated, use max_completion_tokens). Note: some providers enforce a minimum of 16.

Example:

100
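Because max_tokens is deprecated in favor of max_completion_tokens, a client might normalize request bodies before sending them. The following is a minimal sketch; the helper name migrate_max_tokens is hypothetical, and it assumes that an explicit max_completion_tokens should win when both fields are present.

```python
def migrate_max_tokens(body: dict) -> dict:
    """Return a copy of the request body that uses max_completion_tokens only."""
    migrated = dict(body)  # shallow copy so the caller's dict is untouched
    if "max_tokens" in migrated:
        legacy_value = migrated.pop("max_tokens")
        # Assumption: keep an explicit max_completion_tokens if one is already set.
        migrated.setdefault("max_completion_tokens", legacy_value)
    return migrated


normalized = migrate_max_tokens({"model": "openai/gpt-4", "max_tokens": 100})
```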

metadata

object

Key-value pairs for additional object information (max 16 pairs, 64 char keys, 512 char values)

Example:

{
  "session_id": "session-456",
  "user_id": "user-123"
}
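The limits stated for metadata (at most 16 pairs, 64-character keys, 512-character values) can be checked client-side before a request is sent. This is a sketch only: the function name metadata_problems is hypothetical, and it assumes values are strings and that the limits count characters.

```python
MAX_PAIRS = 16
MAX_KEY_CHARS = 64
MAX_VALUE_CHARS = 512


def metadata_problems(metadata: dict) -> list:
    """Return human-readable problems; an empty list means the limits are met."""
    problems = []
    if len(metadata) > MAX_PAIRS:
        problems.append(f"too many pairs: {len(metadata)} > {MAX_PAIRS}")
    for key, value in metadata.items():
        if len(key) > MAX_KEY_CHARS:
            problems.append(f"key too long: {key[:20]}...")
        # Assumption: values are strings; str() guards against other types.
        if len(str(value)) > MAX_VALUE_CHARS:
            problems.append(f"value too long for key: {key[:20]}")
    return problems


ok = metadata_problems({"session_id": "session-456", "user_id": "user-123"})
```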

min_p

number<double> | null

Minimum probability threshold relative to the most likely token. Tokens with probability below min_p * (probability of top token) are filtered out. Not all providers support this parameter.

Example:

0.1
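To make the min_p rule concrete, the sketch below applies it to a toy next-token distribution. The token names and probabilities are invented for illustration, and renormalizing the surviving tokens is an assumption about what a provider does after filtering.

```python
def min_p_filter(probs: dict, min_p: float) -> dict:
    """Drop tokens whose probability is below min_p * (probability of top token)."""
    threshold = min_p * max(probs.values())
    kept = {token: p for token, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    # Assumption: the remaining probabilities are renormalized to sum to 1.
    return {token: p / total for token, p in kept.items()}


# Toy distribution (hypothetical values); with min_p = 0.1 the threshold is about 0.06.
toy_probs = {"Paris": 0.6, "Lyon": 0.3, "Nice": 0.05, "Rome": 0.05}
filtered = min_p_filter(toy_probs, min_p=0.1)
```

Here "Nice" and "Rome" fall below the threshold and are removed, leaving "Paris" and "Lyon".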

modalities

enum<string>[]

Output modalities for the response. Supported values are "text", "image", and "audio".

Available options:

text,

image,

audio

Example:

["text", "image"]

model

string

Model to use for completion

Example:

"openai/gpt-4"

models

string[]

Models to use for completion

Available OpenRouter chat completion models

Example:

["openai/gpt-4", "openai/gpt-4o"]

parallel_tool_calls

boolean | null

Whether to enable parallel function calling during tool use. When true, the model may generate multiple tool calls in a single response.

Example:

true

plugins

object[]

Plugins you want to enable for this request, including their settings.

Example:

{
  "allowed_models": ["anthropic/*", "openai/gpt-4o"],
  "cost_quality_tradeoff": 7,
  "enabled": true,
  "id": "auto-router"
}

presence_penalty

number<double> | null

Presence penalty (-2.0 to 2.0)

Example:

0

provider

object | null

When multiple model providers are available, optionally indicate your routing preference.

Example:

{ "allow_fallbacks": true }

reasoning

object

Configuration options for reasoning models

Example:

{ "effort": "medium", "summary": "concise" }

reasoning_effort

enum<string> | null

Shorthand for setting reasoning effort. Equivalent to setting reasoning.effort. Cannot be used simultaneously with reasoning.effort if they differ.

Available options:

max,

xhigh,

high,

medium,

low,

minimal,

none,

null
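The rule that reasoning_effort and reasoning.effort must not disagree can be checked before sending a request. The sketch below is a client-side illustration only; the function name resolve_reasoning_effort is hypothetical, and how the server reports a conflict is not described on this page.

```python
from typing import Optional


def resolve_reasoning_effort(body: dict) -> Optional[str]:
    """Return the effective effort, or raise if the two fields differ."""
    shorthand = body.get("reasoning_effort")
    nested = (body.get("reasoning") or {}).get("effort")
    if shorthand is not None and nested is not None and shorthand != nested:
        raise ValueError(
            f"reasoning_effort={shorthand!r} conflicts with reasoning.effort={nested!r}"
        )
    return shorthand if shorthand is not None else nested


effort = resolve_reasoning_effort({"reasoning_effort": "medium"})
```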

Example:

"medium"

repetition_penalty

number<double> | null

Penalizes tokens based on how much they have already appeared in the text. A value of 1.0 means no penalty. Values above 1.0 penalize repeated tokens more strongly. Not all providers support this parameter.

Example:

1

response_format

object

Response format configuration

Example:

{ "type": "json_object" }

route

enum<string> | null

deprecated

Deprecated: use providers.sort.partition instead. This field is a backwards-compatible alias for providers.sort.partition and accepts the legacy values "fallback" (maps to "model") and "sort" (maps to "none").

Available options:

fallback,

sort,

null

Example:

"fallback"

seed

integer | null

Random seed for deterministic outputs

Example:

42

service_tier

enum<string> | null

The service tier to use for processing this request.

Available options:

auto,

default,

flex,

priority,

scale,

null

Example:

"auto"

session_id

string

A unique identifier for grouping related requests (e.g., a conversation or agent workflow). When provided, OpenRouter uses it as the sticky routing key, routing all requests in the session to the same provider to maximize prompt cache hits. Also used for observability grouping. If provided in both the request body and the x-session-id header, the body value takes precedence. Maximum of 256 characters.

Maximum string length: 256
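The precedence rule for session_id (body value over the x-session-id header) and the 256-character limit can be modeled as follows. This is a sketch of the documented behavior, not OpenRouter's implementation; the function name resolve_session_id is hypothetical, and case-insensitive header matching is an assumption based on general HTTP conventions.

```python
from typing import Optional

MAX_SESSION_ID_CHARS = 256


def resolve_session_id(body: dict, headers: dict) -> Optional[str]:
    """Pick the effective session ID: body first, then the x-session-id header."""
    header_value = next(
        (value for name, value in headers.items() if name.lower() == "x-session-id"),
        None,
    )
    body_value = body.get("session_id")
    session_id = body_value if body_value is not None else header_value
    if session_id is not None and len(session_id) > MAX_SESSION_ID_CHARS:
        raise ValueError("session_id exceeds 256 characters")
    return session_id


chosen = resolve_session_id({"session_id": "body-1"}, {"x-session-id": "header-1"})
```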

stop

Stop sequences (up to 4)

Example:

["\n"]

stop_server_tools_when

object[]

Stop conditions for the server-tool agent loop. Any condition firing halts the loop (OR logic). When set, this overrides max_tool_calls.

Minimum array length: 1

A single condition that, when met, halts the server-tool agent loop.

Example:

{ "step_count": 5, "type": "step_count_is" }

Example:

[
  { "step_count": 5, "type": "step_count_is" },
  {
    "max_cost_in_dollars": 0.5,
    "type": "max_cost"
  }
]
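The OR logic described for stop_server_tools_when can be illustrated with a small evaluator over the two condition types shown in the example. This is a sketch under assumptions: the function name should_halt is hypothetical, the "reached or exceeded" comparisons are guesses at the intended semantics, and condition types not shown on this page are ignored.

```python
def should_halt(conditions: list, steps_taken: int, cost_so_far: float) -> bool:
    """Return True if any single condition fires (OR logic)."""

    def fires(condition: dict) -> bool:
        if condition["type"] == "step_count_is":
            # Assumption: fires once the step count has been reached.
            return steps_taken >= condition["step_count"]
        if condition["type"] == "max_cost":
            # Assumption: fires once accumulated cost reaches the limit.
            return cost_so_far >= condition["max_cost_in_dollars"]
        return False  # other condition types are not modeled in this sketch

    return any(fires(condition) for condition in conditions)


conditions = [
    {"step_count": 5, "type": "step_count_is"},
    {"max_cost_in_dollars": 0.5, "type": "max_cost"},
]
halt_on_steps = should_halt(conditions, steps_taken=5, cost_so_far=0.1)
halt_on_cost = should_halt(conditions, steps_taken=2, cost_so_far=0.6)
keep_going = should_halt(conditions, steps_taken=2, cost_so_far=0.1)
```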

stream

boolean

default:false

Enable streaming response

Example:

false

stream_options

object | null

Streaming configuration options

Example:

{ "include_usage": true }

temperature

number<double> | null

Sampling temperature (0-2)

Example:

0.7

tool_choice

Tool choice configuration

Available options:

none

Example:

"auto"

tools

object[]

Available tools for function calling

Tool definition for function calling (regular function or OpenRouter built-in server tool)

Example:

{
  "function": {
    "description": "Get the current weather for a location",
    "name": "get_weather",
    "parameters": {
      "properties": {
        "location": {
          "description": "City name",
          "type": "string"
        },
        "unit": {
          "enum": ["celsius", "fahrenheit"],
          "type": "string"
        }
      },
      "required": ["location"],
      "type": "object"
    }
  },
  "type": "function"
}

Example:

[
  {
    "function": {
      "description": "Get weather",
      "name": "get_weather"
    },
    "type": "function"
  }
]

top_a

number<double> | null

Consider only tokens with "sufficiently high" probabilities based on the probability of the most likely token. Not all providers support this parameter.

Example:

0

top_k

integer | null

Limits the model to choose from the top K most likely tokens at each step. A value of 1 means the model will always pick the most likely next token. Not all providers support this parameter.

Example:

40
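As an illustration of top_k, the sketch below keeps only the K most likely tokens from a toy distribution. The token names and probabilities are invented, and renormalizing the kept tokens is an assumption; providers may implement the filter differently.

```python
def top_k_filter(probs: dict, k: int) -> dict:
    """Keep the k most likely tokens and renormalize their probabilities."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


toy_probs = {"Paris": 0.6, "Lyon": 0.3, "Nice": 0.1}  # hypothetical values
greedy = top_k_filter(toy_probs, k=1)
top_two = top_k_filter(toy_probs, k=2)
```

With k=1 only "Paris" survives, which matches the description that the model always picks the most likely next token.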

top_logprobs

integer | null

Number of top log probabilities to return (0-20)

Example:

5

top_p

number<double> | null

Nucleus sampling parameter (0-1)

Example:

1
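Several numeric parameters on this page carry stated ranges: frequency_penalty and presence_penalty (-2.0 to 2.0), temperature (0 to 2), top_p (0 to 1), and top_logprobs (0 to 20). A client could check them before sending a request, as in this sketch. The function name out_of_range is hypothetical, and treating the bounds as inclusive is an assumption.

```python
# Ranges as stated on this page; inclusive bounds are assumed.
RANGES = {
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "top_logprobs": (0, 20),
}


def out_of_range(body: dict) -> list:
    """Return the names of parameters whose values fall outside their range."""
    bad = []
    for name, (low, high) in RANGES.items():
        value = body.get(name)
        # None means "not set", which every nullable parameter here allows.
        if value is not None and not (low <= value <= high):
            bad.append(name)
    return bad


valid = out_of_range({"temperature": 0.7, "top_p": 1, "frequency_penalty": 0})
invalid = out_of_range({"temperature": 2.5, "top_logprobs": 21})
```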

trace

object

Metadata for observability and tracing. Known keys (trace_id, trace_name, span_name, generation_name, parent_span_id) have special handling. Additional keys are passed through as custom metadata to configured broadcast destinations.

Example:

{
  "trace_id": "trace-abc123",
  "trace_name": "my-app-trace"
}

user

string

Unique user identifier

Example:

"user-123"

Response

Successful chat completion response

Chat completion response

choices

object[]

required

List of completion choices

created

integer

required

Unix timestamp of creation

Example:

1677652288

id

string

required

Unique completion identifier

Example:

"chatcmpl-123"

model

string

required

Model used for completion

Example:

"openai/gpt-4"

object

enum<string>

required

Available options:

chat.completion

system_fingerprint

string | null

required

System fingerprint

Example:

"fp_44709d6fcb"

openrouter_metadata

object

Example:

{
  "attempt": 1,
  "endpoints": {
    "available": [
      {
        "model": "openai/gpt-4o",
        "provider": "OpenAI",
        "selected": true
      }
    ],
    "total": 1
  },
  "is_byok": false,
  "region": "iad",
  "requested": "openai/gpt-4o",
  "strategy": "direct",
  "summary": "available=1, selected=OpenAI"
}

service_tier

string | null

The service tier used by the upstream provider for this request

Example:

"default"

usage

object

Token usage statistics

Example:

{
  "completion_tokens": 15,
  "completion_tokens_details": { "reasoning_tokens": 5 },
  "cost": 0.0012,
  "cost_details": {
    "upstream_inference_completions_cost": 0.0004,
    "upstream_inference_cost": null,
    "upstream_inference_prompt_cost": 0.0008
  },
  "is_byok": false,
  "prompt_tokens": 10,
  "prompt_tokens_details": { "cached_tokens": 2 },
  "server_tool_use_details": {
    "tool_calls_executed": 2,
    "tool_calls_requested": 2
  },
  "total_tokens": 25
}
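The arithmetic in the usage example can be checked with a short script. This is a sketch: the function name check_usage is hypothetical, and it assumes that cached_tokens is a subset of prompt_tokens and that cost equals the sum of the prompt and completion entries in cost_details. Both hold for the example above but are not stated as general rules on this page.

```python
import math


def check_usage(usage: dict) -> dict:
    """Derive a few consistency checks from a usage object."""
    prompt = usage["prompt_tokens"]
    completion = usage["completion_tokens"]
    cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    details = usage.get("cost_details", {})
    # Null cost entries are treated as 0 in this sketch.
    cost_parts = (details.get("upstream_inference_prompt_cost") or 0.0) + (
        details.get("upstream_inference_completions_cost") or 0.0
    )
    return {
        "tokens_add_up": prompt + completion == usage["total_tokens"],
        "uncached_prompt_tokens": prompt - cached,
        "cost_parts_match": math.isclose(cost_parts, usage["cost"]),
    }


example_usage = {
    "completion_tokens": 15,
    "cost": 0.0012,
    "cost_details": {
        "upstream_inference_completions_cost": 0.0004,
        "upstream_inference_cost": None,
        "upstream_inference_prompt_cost": 0.0008,
    },
    "prompt_tokens": 10,
    "prompt_tokens_details": {"cached_tokens": 2},
    "total_tokens": 25,
}
summary = check_usage(example_usage)
```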