POST https://api.vcita.biz/v3/ai/chat_completions
## Overview
Creates chat completions using multiple LLM providers (OpenAI, Anthropic, Google). Supports text and multimodal input (images, audio, video, files), streaming responses via Server-Sent Events (SSE), an async mode for long-running thinking models, tool calling (function calling), and structured output (JSON schema).
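A minimal request body can be sketched as follows. This is an illustration, not a confirmed schema: the `model` and `messages` field names follow the common chat-completions convention, and Bearer-token authorization is assumed.

```python
import json

# Hypothetical minimal request body. The field names below (`model`,
# `messages`, `stream`) follow the common chat-completions convention
# and are assumptions, not confirmed by this reference.
payload = {
    "model": "anthropic/claude-3-5-haiku-latest",  # provider-prefixed id from the models table
    "messages": [
        {"role": "user", "content": "Summarize this meeting in two sentences."}
    ],
    "stream": False,  # request a single JSON response instead of SSE
}

# Serialize for the POST body; the Authorization header (assumed Bearer
# token) would accompany this in a real request.
body = json.dumps(payload)
print(body)
```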
## Supported Models
| Provider | Models |
|---|---|
| OpenAI | openai/gpt-5, openai/gpt-4o, openai/gpt-4o-mini, openai/o3-mini, openai/o1 |
| Anthropic | anthropic/claude-sonnet-4-5-20250929, anthropic/claude-haiku-4-5-20251001, anthropic/claude-3-5-sonnet-latest, anthropic/claude-3-5-haiku-latest |
| Google | google/gemini-2.5-pro, google/gemini-2.0-flash, google/gemini-2.0-pro |
## Streaming
By default, responses are streamed as Server-Sent Events (SSE). Set `stream: false` to receive a single JSON response instead.
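Consuming the stream means reading `data:` lines from the SSE response. The sketch below parses a hardcoded excerpt; the chunk shape (`delta` field) and the `[DONE]` sentinel are assumptions for illustration, since the event payload format is not documented here.

```python
import json

# Hypothetical SSE excerpt; the `delta` field and `[DONE]` sentinel
# are assumed, not confirmed by this reference.
raw_sse = (
    'data: {"delta": "Hello"}\n\n'
    'data: {"delta": " world"}\n\n'
    "data: [DONE]\n\n"
)

def iter_sse_data(stream_text):
    """Yield the payload of each SSE event (lines starting with 'data: ')."""
    for line in stream_text.splitlines():
        if line.startswith("data: "):
            yield line[len("data: "):]

chunks = []
for data in iter_sse_data(raw_sse):
    if data == "[DONE]":  # assumed end-of-stream marker
        break
    chunks.append(json.loads(data)["delta"])

text = "".join(chunks)
print(text)  # Hello world
```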
## Async Mode
Set `async: true` to queue the request and receive a run UID. Poll `GET /v3/ai/chat_completion_runs/{uid}` for the result. Recommended for thinking models.
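The polling loop can be sketched as below. The real response schema of the run endpoint is not documented here, so `get_run` is a stand-in that replays fake states, and the `status`/`result` fields are assumptions.

```python
import time

# Fake run lifecycle standing in for the API; the `status` and `result`
# fields are assumed, not confirmed by this reference.
_fake_states = iter([
    {"status": "queued"},
    {"status": "in_progress"},
    {"status": "completed", "result": {"content": "Done thinking."}},
])

def get_run(uid):
    """Stand-in for GET /v3/ai/chat_completion_runs/{uid}."""
    return next(_fake_states)

def wait_for_run(uid, interval=0.0, max_polls=10):
    """Poll the run UID until it reaches a terminal status."""
    for _ in range(max_polls):
        run = get_run(uid)
        if run["status"] in ("completed", "failed"):
            return run
        time.sleep(interval)  # back off between polls
    raise TimeoutError(f"run {uid} did not finish after {max_polls} polls")

run = wait_for_run("run_abc123")  # hypothetical run UID
print(run["status"])
```

In a real client, `interval` would typically be a second or more, possibly with exponential backoff, since thinking models can run for a long time.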
Available for Staff tokens
