Audio Inputs
Send audio files to compatible models for transcription, analysis, and processing. Audio input requests use the/api/v1/chat/completions API with the input_audio content type. Audio files must be base64-encoded and include the format specification.
Note: Audio files must be base64-encoded - direct URLs are not supported for audio content.
You can search for models that support audio input by filtering to audio input modality on our Models page.
Sending Audio Files
Here’s how to send an audio file for processing:Supported Audio Input Formats
Supported audio formats vary by provider. Common formats include:wav- WAV audiomp3- MP3 audioaiff- AIFF audioaac- AAC audioogg- OGG Vorbis audioflac- FLAC audiom4a- M4A audiopcm16- PCM16 audiopcm24- PCM24 audio
Audio Output
OpenRouter supports receiving audio responses from models that have audio output capabilities. To request audio output, include themodalities and audio parameters in your request.
You can search for models that support audio output by filtering to audio output modality on our Models page.
Requesting Audio Output
To receive audio output, setmodalities to ["text", "audio"] and provide the audio configuration with your desired voice and format:
Streaming Chunk Format
Audio output requires streaming (stream: true). Audio data and transcript are delivered incrementally via the delta.audio field in each chunk:
Audio Configuration Options
Theaudio parameter accepts the following options:
| Option | Description |
|---|---|
voice | The voice to use for audio generation (e.g., alloy, echo, fable, onyx, nova, shimmer). Available voices vary by model. |
format | The audio format for the output (e.g., wav, mp3, flac, opus, pcm16). Available formats vary by model. |