Pareto Router

Route to a strong coding model by minimum coding score, without committing to a specific one

The Pareto Router (openrouter/pareto-code) lets OpenRouter pick a strong coding model for each request without committing you to a specific one. You express a single min_coding_score preference between 0 and 1, and the router sends your request to a coding model that meets that bar.

Overview

The name comes from Pareto efficiency: at any given cost or capability point, we route to a coding model that sits on the quality/cost frontier we maintain for OpenRouter-hosted models.

The Pareto Router is tuned for coding use cases. It maintains a curated shortlist of strong coding models currently available on OpenRouter, ranked by widely-used external coding benchmarks. Your min_coding_score sets how capable the selected model must be — higher scores route to stronger (and typically more expensive) models. The exact shortlist and selection logic evolve over time as new models land and benchmarks shift.

Usage

Set your model to openrouter/pareto-code and optionally pass the pareto-router plugin to control the minimum coding score:

import { OpenRouter } from '@openrouter/sdk';

const openRouter = new OpenRouter({
  apiKey: '<OPENROUTER_API_KEY>',
});

const completion = await openRouter.chat.send({
  model: 'openrouter/pareto-code',
  plugins: [
    {
      id: 'pareto-router',
      min_coding_score: 0.8,
    },
  ],
  messages: [
    {
      role: 'user',
      content: 'Write a Python function that merges two sorted lists.',
    },
  ],
});

console.log(completion.choices[0].message.content);
console.log('Model used:', completion.model);

The min_coding_score parameter

min_coding_score is an optional number between 0 and 1, where 1 is best. It sets a floor on how capable the selected model needs to be for your request. Higher scores route to stronger coders at the top of the shortlist; lower scores open up cheaper, faster options.
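As a rough illustration, a client could map task complexity to a score before sending the request. The helper and thresholds below are hypothetical, not official guidance — tune them for your own workload:

```typescript
// Hypothetical mapping from task kind to a min_coding_score.
// The thresholds are illustrative only, not OpenRouter recommendations.
type TaskKind = 'boilerplate' | 'refactor' | 'algorithm';

function scoreFor(task: TaskKind): number {
  switch (task) {
    case 'boilerplate':
      return 0.3; // cheaper, faster models are usually fine
    case 'refactor':
      return 0.6; // mid-tier coding models
    case 'algorithm':
      return 0.9; // route to the strongest coders
  }
}
```

The returned value would then be passed as min_coding_score in the pareto-router plugin entry.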

If you omit min_coding_score, the router defaults to the strongest available coders.

A model is drawn from the matching shortlist on each request. If every model in the matching set is temporarily unavailable, the router falls back to the next-closest set rather than failing the request. The response model field always reports the concrete model that handled the request.

Response

The response includes the model field showing which coding model was actually used:

{
  "id": "gen-...",
  "model": "anthropic/claude-opus-4.7",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 128,
    "total_tokens": 170
  }
}

How It Works

  1. Shortlist resolution: Your min_coding_score value is used to pick the set of coding models that meet your quality bar.
  2. Candidate filtering: The router filters the shortlist to models that are currently published on OpenRouter.
  3. Selection: A single model is selected from the filtered candidates.
  4. Fallback: If every candidate is unavailable, the router steps through neighboring sets to find a coding-capable model.
  5. Request forwarding: Your request is forwarded to the selected model.
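The steps above can be sketched as a toy selection function. This is an illustration of the flow only — the Candidate shape, model names, and scores are invented, and OpenRouter's actual implementation is not public:

```typescript
// Toy sketch of the routing flow described above; not the real router.
interface Candidate {
  id: string;
  codingScore: number;  // benchmark-derived score in [0, 1] (invented)
  pricePerMTok: number; // illustrative price, USD per million tokens
  available: boolean;   // currently published and reachable
}

function routeParetoCode(shortlist: Candidate[], minCodingScore?: number): Candidate {
  // Candidate filtering: only models currently available on OpenRouter.
  const available = shortlist.filter((m) => m.available);
  if (available.length === 0) throw new Error('no coding model available');

  // Omitting min_coding_score defaults to the strongest available coder.
  if (minCodingScore === undefined) {
    return available.reduce((a, b) => (a.codingScore >= b.codingScore ? a : b));
  }

  // Shortlist resolution: models that meet the quality bar.
  const pool = available.filter((m) => m.codingScore >= minCodingScore);

  // Selection: deterministic, cheapest qualifying model
  // (mirroring the "sorted by price" note in Limitations).
  if (pool.length > 0) {
    return pool.reduce((a, b) => (a.pricePerMTok <= b.pricePerMTok ? a : b));
  }

  // Fallback: nothing meets the bar right now, so step to the
  // next-closest set — here, the strongest model still available.
  return available.reduce((a, b) => (a.codingScore >= b.codingScore ? a : b));
}
```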

Pricing

The Pareto Router itself adds no fee. You pay only for the underlying model that handles the request. Because model selection varies across the shortlist, per-request cost will vary too. Use a lower min_coding_score when cost is the primary concern.
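Since the router adds no fee, per-request cost is just the selected model's token prices applied to the usage block in the response. A minimal sketch, assuming hypothetical per-million-token rates (not actual OpenRouter pricing):

```typescript
// Estimate the USD cost of one request from its usage block.
// Rates are supplied by the caller; the values used here are hypothetical.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
}

function estimateCostUSD(
  usage: Usage,
  promptPerMTok: number,     // USD per 1M prompt tokens
  completionPerMTok: number, // USD per 1M completion tokens
): number {
  return (
    (usage.prompt_tokens / 1e6) * promptPerMTok +
    (usage.completion_tokens / 1e6) * completionPerMTok
  );
}
```

For the usage block in the example response above (42 prompt tokens, 128 completion tokens), illustrative rates of $3/$15 per million tokens would come to roughly $0.002.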

Limitations

  • Coding only: openrouter/pareto-code is tuned for coding tasks. For other use cases, use a different router or choose a specific model.
  • Model selection may change over time: For a given min_coding_score, the same model is selected deterministically (sorted by price). However, the selected model may change when the underlying shortlist is updated (e.g. new models are added or benchmarks shift). Within a conversation, provider sticky routing keeps your requests on the same provider endpoint to maximize cache hits.
  • Coding score only: min_coding_score is the only router parameter. You can’t directly cap cost or latency per request.