Create an AudioTranscription

POST https://api.vcita.biz/v3/ai/audio_transcriptions

Overview

Transcribe audio files using multiple providers (OpenAI Whisper, OpenAI GPT-4o Transcribe, Google Chirp). Supports file upload via multipart form data.

Supported Models

Provider	Models
OpenAI	`openai/whisper-1`, `openai/gpt-4o-transcribe`
Google	`google/chirp-2`

Async Mode

Set async: true to queue the transcription and receive a run UID. Poll GET /v3/ai/transcription_runs/{uid} for the result. Recommended for long audio files or diarization.
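The polling step can be sketched as a small loop. This is a minimal illustration, not the API's client: the shape of the run payload is not described on this page, so the completion check (`is_finished`) and the HTTP call (`fetch_run`) are hypothetical callables supplied by the caller. The 6-second default interval is a conservative guess; whether the 10 requests per minute limit on this endpoint also applies to polling is not stated.

```python
import time
from typing import Any, Callable, Dict


def poll_transcription_run(
    fetch_run: Callable[[str], Dict[str, Any]],
    uid: str,
    is_finished: Callable[[Dict[str, Any]], bool],
    interval_s: float = 6.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_run(uid) until is_finished(run) is true or the timeout elapses.

    fetch_run is assumed to perform GET /v3/ai/transcription_runs/{uid}
    and return the decoded body; it is injected so the loop stays testable.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        run = fetch_run(uid)
        if is_finished(run):
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"transcription run {uid} did not finish within {timeout_s}s")
        sleep(interval_s)
```

Passing the completion check in as a function keeps the loop independent of whatever field the run resource actually uses to signal that it has finished.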

Available for Staff tokens. Send the token as a Bearer JWT in the Authorization header.


Body Params

file

required

Audio file to transcribe (max 25MB). Supported formats: mp3, mp4, m4a, wav, webm, flac, ogg.
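A client can check these limits locally before uploading, which avoids a round trip that would end in a 413 or a validation error. The sketch below is an assumption-laden illustration: `check_audio_file` is a hypothetical helper, "25MB" is read as 25 MiB, and the format is inferred from the file extension only, which may differ from how the server inspects the file.

```python
from pathlib import Path
from typing import List

# Assumption: "25MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "m4a", "wav", "webm", "flac", "ogg"}


def check_audio_file(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_BYTES}-byte limit")
    return problems
```

For a file on disk, the size can be obtained with `Path(path).stat().st_size`.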

model

string

required

Model identifier in provider/model format (e.g., "openai/gpt-4o-transcribe", "openai/whisper-1", "google/chirp-2").
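The provider/model format lends itself to a simple client-side check against the Supported Models table. The following is a sketch only: `split_model_id` is a hypothetical helper, and the hard-coded table mirrors this page and would need updating if the server's model list changes.

```python
from typing import Tuple

# Mirrors the Supported Models table on this page.
SUPPORTED_MODELS = {
    "openai": {"whisper-1", "gpt-4o-transcribe"},
    "google": {"chirp-2"},
}


def split_model_id(model: str) -> Tuple[str, str]:
    """Split 'provider/model' and verify the pair is listed in SUPPORTED_MODELS."""
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected provider/model, got {model!r}")
    if name not in SUPPORTED_MODELS.get(provider, set()):
        raise ValueError(f"unsupported model: {model!r}")
    return provider, name
```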

language

string

ISO-639-1 language code for the audio content (e.g., "en" for English, "es" for Spanish, "he" for Hebrew). When omitted, the provider will auto-detect the language.

prompt

string

Optional guiding prompt to improve transcription accuracy. Useful for providing context, domain-specific terminology, or correcting recurring misrecognitions.

response_format

string

enum

Desired response format. 'json' returns a simple JSON object containing the transcribed text, 'text' returns plain text, 'srt' returns SubRip subtitle format, 'verbose_json' returns detailed JSON with word-level timestamps, and 'vtt' returns WebVTT subtitle format. Defaults to 'json'.

Allowed: 'json', 'text', 'srt', 'verbose_json', 'vtt'
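Because the chosen format determines what the response body looks like, client code usually needs to decode it differently per format. The sketch below assumes that 'json' and 'verbose_json' bodies are JSON documents and that the other three formats arrive as UTF-8 text; `decode_transcription` is a hypothetical helper and the exact response schema is not specified on this page.

```python
import json
from typing import Any

JSON_FORMATS = {"json", "verbose_json"}
TEXT_FORMATS = {"text", "srt", "vtt"}


def decode_transcription(body: bytes, response_format: str = "json") -> Any:
    """Decode a response body according to the response_format that was requested."""
    if response_format in JSON_FORMATS:
        return json.loads(body.decode("utf-8"))
    if response_format in TEXT_FORMATS:
        # Plain text and subtitle formats are returned to the caller unchanged.
        return body.decode("utf-8")
    raise ValueError(f"unknown response_format: {response_format!r}")
```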

temperature

number

0 to 1

Sampling temperature between 0 and 1. Higher values make the output more random, while lower values make it more deterministic. Defaults to 0.

async

boolean

Defaults to false

Process the request asynchronously. Returns a run UID immediately that can be polled via GET /v3/ai/transcription_runs/{uid}. Recommended for long audio files or when diarization is needed.
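Putting the body parameters together, a request could be assembled roughly as follows. This is a sketch using only the Python standard library, not an official client: `build_transcription_request` is a hypothetical helper, the file part is sent as `application/octet-stream`, and encoding the boolean `async` field as the string "true" is an assumption about how the server parses form values.

```python
import urllib.request
import uuid
from typing import Dict, Optional

API_URL = "https://api.vcita.biz/v3/ai/audio_transcriptions"


def build_transcription_request(
    token: str,
    filename: str,
    audio: bytes,
    model: str,
    extra_fields: Optional[Dict[str, str]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for the endpoint."""
    boundary = uuid.uuid4().hex
    fields = {"model": model, **(extra_fields or {})}
    parts = []
    # One form part per text field (model, language, prompt, ...).
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # The audio itself goes in the "file" part.
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + audio + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be sent with `urllib.request.urlopen(req)`. Optional parameters are passed as strings, for example `{"language": "en", "async": "true"}`.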

Responses


application/json

200 Successful transcription.

202 Async mode: job queued successfully. Poll the returned run UID for the result.

400 Bad Request - The request is malformed or contains invalid syntax.

401 Unauthorized - The bearer token is missing, expired, or invalid.

413 Payload Too Large - The uploaded audio file exceeds the 25MB size limit.

422 Unprocessable Entity - Validation error with valid syntax.

429 Too Many Requests - Rate limit exceeded (10 requests per minute).
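One way a client might act on these status codes is sketched below. The action names and both helpers (`classify_status`, `retry_delay_s`) are hypothetical. The 6-second base delay is derived from the stated limit of 10 requests per minute, and the exponential backoff is an illustrative choice rather than documented server guidance.

```python
def classify_status(status: int) -> str:
    """Map a documented status code to a suggested client action."""
    if status == 200:
        return "done"            # transcription is in the response body
    if status == 202:
        return "poll"            # async job queued; poll the run UID
    if status == 401:
        return "reauthenticate"  # token missing, expired, or invalid
    if status == 429:
        return "retry"           # rate limited; wait before retrying
    if status in (400, 413, 422):
        return "fix_request"     # retrying the same request will not help
    return "unexpected"


def retry_delay_s(attempt: int, base_s: float = 6.0, cap_s: float = 60.0) -> float:
    """Exponential backoff for 429 responses; attempt counts from 0."""
    return min(cap_s, base_s * (2 ** attempt))
```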