Create an AudioTranscription

Overview

Transcribe audio files using multiple providers (OpenAI Whisper, OpenAI GPT-4o Transcribe, Google Chirp). Supports file upload via multipart form data.

Supported Models

Provider    Models
OpenAI      openai/whisper-1, openai/gpt-4o-transcribe
Google      google/chirp-2

Async Mode

Set async: true to queue the transcription and receive a run UID. Poll GET /v3/ai/transcription_runs/{uid} for the result. Recommended for long audio files or diarization.
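A minimal polling sketch for the async flow described above. Only the path /v3/ai/transcription_runs/{uid} comes from this page; the base URL is a placeholder, and because the shape of the run response is not documented here, the caller supplies both the fetch function and the "finished" check rather than the sketch assuming any field names.

```python
import time
from typing import Any, Callable

# Path taken from this page; the host it is served from is not documented here.
RUN_PATH = "/v3/ai/transcription_runs/{uid}"


def run_url(base_url: str, uid: str) -> str:
    """Build the polling URL for a run UID (base_url is a placeholder)."""
    return base_url.rstrip("/") + RUN_PATH.format(uid=uid)


def poll_run(
    fetch: Callable[[], Any],
    is_finished: Callable[[Any], bool],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() until is_finished(result) is true, then return the result.

    fetch performs the authenticated GET and returns the parsed body.
    is_finished is caller-supplied because the response schema is an unknown.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if is_finished(result):
            return result
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"run not finished after {max_attempts} attempts")
```

Keeping the HTTP call and the completion check outside the loop means the same helper works whatever client library and response fields end up being used.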

Available for Staff tokens

Body Params
file
required

Audio file to transcribe (max 25MB). Supported formats: mp3, mp4, m4a, wav, webm, flac, ogg.
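The size and format limits above can be checked on the client before uploading. The sketch below is illustrative only: it assumes "25MB" means 25 × 1024 × 1024 bytes (the page does not say whether the limit is decimal or binary), and it judges format by file extension alone, which the server may or may not do.

```python
import os

# Assumption: "25MB" is read as 25 MiB; the page does not specify the unit exactly.
MAX_BYTES = 25 * 1024 * 1024
# Formats listed on this page.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "m4a", "wav", "webm", "flac", "ogg"}


def check_audio_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

For a file on disk, the size can be obtained with os.path.getsize(path) and passed in as size_bytes.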

string
required

Model identifier in provider/model format (e.g., "openai/gpt-4o-transcribe", "openai/whisper-1", "google/chirp-2").
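As a small illustration of the provider/model format, the sketch below splits an identifier and checks it against the models listed under Supported Models. The helper names are hypothetical, and the lookup table is copied from this page, so it may lag behind what the API actually accepts.

```python
# Models listed on this page; the live API may accept more or fewer.
SUPPORTED_MODELS = {
    "openai": {"whisper-1", "gpt-4o-transcribe"},
    "google": {"chirp-2"},
}


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' into its two parts, rejecting malformed input."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model


def is_listed_model(model_id: str) -> bool:
    """True if the identifier is well formed and appears in the table above."""
    try:
        provider, model = split_model_id(model_id)
    except ValueError:
        return False
    return model in SUPPORTED_MODELS.get(provider, set())
```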

string

ISO-639-1 language code for the audio content (e.g., "en" for English, "es" for Spanish, "he" for Hebrew). When omitted, the provider will auto-detect the language.

string

Optional guiding prompt to improve transcription accuracy. Useful for providing context, domain-specific terminology, or correcting recurring misrecognitions.

string
enum

Desired response format. 'json' returns a simple JSON with text, 'text' returns plain text, 'srt' returns SubRip subtitle format, 'verbose_json' returns detailed JSON with word-level timestamps, 'vtt' returns WebVTT subtitle format. Defaults to 'json'.

Allowed: json, text, srt, verbose_json, vtt
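Because the response body is JSON for some formats and plain text for others, a client has to decode it according to the format it requested. A minimal sketch, assuming the body has already been read as a string; the function name is hypothetical, and the exact JSON fields returned are not documented on this page.

```python
import json
from typing import Any

# Grouping derived from the format descriptions above.
JSON_FORMATS = {"json", "verbose_json"}
TEXT_FORMATS = {"text", "srt", "vtt"}


def decode_body(response_format: str, body: str) -> Any:
    """Parse JSON formats into Python objects; return text formats unchanged."""
    if response_format in JSON_FORMATS:
        return json.loads(body)
    if response_format in TEXT_FORMATS:
        return body
    raise ValueError(f"unknown response format: {response_format!r}")
```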
number
0 to 1

Sampling temperature between 0 and 1. Higher values make the output more random, while lower values make it more deterministic. Defaults to 0.

async
boolean
Defaults to false

Process the request asynchronously. Returns a run UID immediately that can be polled via GET /v3/ai/transcription_runs/{uid}. Recommended for long audio files or when diarization is needed.
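The optional parameters above could be assembled into multipart form fields along these lines. This is a sketch under stated assumptions: apart from async, the field names used below (model, language, prompt, response_format, temperature) are guesses, since the parameter names did not survive on this page, and sending the boolean as the string "true"/"false" is likewise an assumption. The audio file itself would be attached separately by whatever HTTP client is used.

```python
from typing import Optional


def build_form_fields(
    model: str,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    response_format: str = "json",
    temperature: float = 0.0,
    run_async: bool = False,
) -> dict[str, str]:
    """Build the non-file form fields for a transcription request.

    All field names except "async" are hypothetical; check them against
    the live API before use.
    """
    # Range taken from this page: temperature must be between 0 and 1.
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0 and 1")
    fields = {
        "model": model,
        "response_format": response_format,
        "temperature": str(temperature),
        # Assumption: booleans are sent as lowercase strings in form data.
        "async": "true" if run_async else "false",
    }
    # Omit optional fields so the provider can auto-detect the language.
    if language is not None:
        fields["language"] = language
    if prompt is not None:
        fields["prompt"] = prompt
    return fields
```

Leaving language out entirely, rather than sending an empty value, matches the documented behavior that omission triggers auto-detection.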
