model_api_documentation_gitbook
Comprehensive API reference for all available AI models
Last Updated: 2026-01-19
Total Models: 107 (32 LLM + 75 Multimodal)
Providers: 12 (Anthropic, DeepSeek, Google, Microsoft, MiniMax, Mistral AI, Nous Research, OpenAI, Qwen, Vidu, Vtrix, xAI)
Quick Start
Authentication Required
All API requests require authentication using a Bearer token in the Authorization header.
Headers
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
Obtaining API Key
Sign Up
Sign up for an account at https://vtrix.ai
Navigate to Dashboard
Navigate to the API Keys section in your dashboard
Generate API Key
Generate a new API key
Secure Storage
Copy and securely store your API key
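One way to keep the key out of source code is to read it from an environment variable and build the two required headers from it. The following is a minimal sketch, not an official client: the variable name `VTRIX_API_KEY` and the helper `build_headers` are assumptions made for illustration.

```python
import os


def build_headers(api_key=None):
    """Return the Authorization and Content-Type headers for a request."""
    # VTRIX_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("VTRIX_API_KEY")
    if not key:
        raise ValueError("No API key: set VTRIX_API_KEY or pass api_key")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing the key explicitly (`build_headers("YOUR_API_KEY")`) is convenient in tests; in deployed code the environment variable avoids committing the key to version control.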
Base URLs
https://cloud.vtrix.ai/llm/chat/completions
Use this endpoint for all Large Language Models (LLMs), including Anthropic, DeepSeek, Google, Microsoft, Mistral AI, Nous Research, OpenAI, Qwen, and xAI models.
https://cloud.vtrix.ai/model/v1/generation
Use this endpoint for all multimodal models, including MiniMax, Vidu, and Vtrix image/video generation models.
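The split between the two base URLs can be sketched as a small lookup by provider. The helper name `endpoint_for` is hypothetical, and routing purely by provider is a simplification of the description above: a provider may also list a model of the other kind, so the individual model section should take precedence.

```python
LLM_ENDPOINT = "https://cloud.vtrix.ai/llm/chat/completions"
MULTIMODAL_ENDPOINT = "https://cloud.vtrix.ai/model/v1/generation"

# Provider groups as listed in the Base URLs section.
LLM_PROVIDERS = {
    "anthropic", "deepseek", "google", "microsoft", "mistral ai",
    "nous research", "openai", "qwen", "xai",
}
MULTIMODAL_PROVIDERS = {"minimax", "vidu", "vtrix"}


def endpoint_for(provider):
    """Return the base URL for a provider name (case-insensitive)."""
    name = provider.strip().lower()
    if name in LLM_PROVIDERS:
        return LLM_ENDPOINT
    if name in MULTIMODAL_PROVIDERS:
        return MULTIMODAL_ENDPOINT
    raise ValueError(f"Unknown provider: {provider!r}")
```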
Important Notes
Keep your API key secure and never share it publicly
Rate limits apply to all endpoints
Ensure sufficient account balance before making requests
API Reference
Anthropic
Claude Sonnet 4.5
Model ID: vtrix-claude-sonnet-4.5
Anthropic's most intelligent Sonnet model. Hybrid reasoning with fast and extended thinking modes. Excels at real-world coding, agents, and computer use with 200K context.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-claude-sonnet-4.5 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-64000 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
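The parameters above can be assembled into a request along the following lines. This is a sketch under stated assumptions rather than the official client: it assumes the parameters are sent as top-level fields of a JSON body via POST, and the function name `build_chat_request` is made up for illustration. The range checks mirror the limits listed for this model.

```python
import json
import urllib.request

LLM_ENDPOINT = "https://cloud.vtrix.ai/llm/chat/completions"


def build_chat_request(api_key, messages, max_tokens=None,
                       temperature=1.0, top_p=1.0, stream=False):
    """Build (but do not send) a chat request for vtrix-claude-sonnet-4.5."""
    # Validate against the documented ranges before building the body.
    if max_tokens is not None and not 1 <= max_tokens <= 64000:
        raise ValueError("max_tokens must be between 1 and 64000")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0.0 and 1.0")

    payload = {
        "model": "vtrix-claude-sonnet-4.5",
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens  # optional, so only sent when set

    return urllib.request.Request(
        LLM_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(req)`. The response body is left unparsed here because its structure is not shown in this section.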
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
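A client might handle these status codes as sketched below. The policy is an assumption, since the documentation does not prescribe one: 401 and 403 fail immediately because retrying cannot fix a bad key or an empty balance, while 429 and 500 are retried with exponential backoff. `send` is a hypothetical callable that performs one request and returns a `(status, body)` pair.

```python
import time

RETRYABLE = {429, 500}
FATAL = {
    401: "API key is missing or invalid",
    403: "Insufficient balance or permission denied",
}


class ApiError(Exception):
    """Raised when a request fails permanently or retries run out."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status in FATAL:
            raise ApiError(status, FATAL[status])
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ApiError(status, "retries exhausted")
```

The `sleep` argument is injectable so the backoff can be exercised in tests without real delays.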
Claude Opus 4.5
Model ID: vtrix-claude-opus-4.5
Anthropic's most capable model for complex software engineering and agentic workflows. 80.9% on SWE-bench Verified. Optimized for long-horizon computer use tasks.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-claude-opus-4.5 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-64000 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Claude 3.5 Haiku
Model ID: vtrix-claude-3.5-haiku
Fast and capable model surpassing Claude 3 Opus on most benchmarks. 40.6% on SWE-bench Verified. Best value for coding assistance and quick tasks.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-claude-3.5-haiku |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-8192 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Claude Haiku 4.5
Model ID: vtrix-claude-haiku-4.5
Anthropic's most efficient model matching Claude Sonnet 4 performance. >73% on SWE-bench Verified. Optimized for high-throughput coding and computer use tasks.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-claude-haiku-4.5 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-64000 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Claude Sonnet 4
Model ID: vtrix-claude-sonnet-4
Major upgrade to Sonnet 3.7 with 72.7% on SWE-bench Verified. Ideal balance of capability and efficiency for everyday coding and business tasks.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-claude-sonnet-4 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-64000 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Claude 3 Haiku
Model ID: vtrix-claude-3-haiku
Anthropic's fastest model for near-instant responses. Ideal for lightweight tasks, customer service, and content moderation at the lowest cost in the Claude family.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-claude-3-haiku |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-4096 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
DeepSeek
DeepSeek V3
Model ID: vtrix-deepseek-v3-0324
671B parameter MoE model with 37B active parameters. Supports thinking and non-thinking modes with FP8 training. Strong reasoning, coding, and Chinese language capabilities at extremely low cost.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-deepseek-v3-0324 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-16384 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
DeepSeek V3.1
Model ID: vtrix-deepseek-v3.1
Enhanced V3 with improved tool use and code generation. Two-phase long-context training for better coherence. Best-in-class efficiency at 671B/37B architecture.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-deepseek-v3.1 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-16384 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
DeepSeek R1
Model ID: vtrix-deepseek-r1
Open-source reasoning model matching OpenAI o1. 671B/37B MoE with MIT license. Transparent reasoning tokens for explainable AI applications.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-deepseek-r1 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-32768 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Google
Gemini 2.5 Flash
Model ID: vtrix-gemini-2.5-flash
Google's workhorse model with hybrid reasoning. Excels at coding, math, science with configurable thinking depth. 1M context, native tool use, and multimodal capabilities at an affordable price point.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-gemini-2.5-flash |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-65535 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Gemini 2.5 Pro
Model ID: vtrix-gemini-2.5-pro
Google's state-of-the-art reasoning model. #1 on LMArena leaderboard. Excels at coding, math, and science with 1M context and native multimodal understanding.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-gemini-2.5-pro |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-65536 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Gemini 3 Pro Preview
Model ID: vtrix-gemini-3-pro-preview
Google's frontier model for high-precision multimodal reasoning. Supports text, image, video, and audio inputs with 1M context. State-of-the-art on complex reasoning benchmarks.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-gemini-3-pro-preview |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-65536 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Gemini 2.0 Flash
Model ID: vtrix-gemini-2.0-flash
Google's fast multimodal model with native audio understanding. Significantly faster TTFT than Flash 1.5. Ideal for real-time applications and streaming.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-gemini-2.0-flash |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-8192 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
Microsoft
WizardLM-2 8x22B
Model ID: vtrix-wizardlm-2-8x22b
Microsoft's most advanced Wizard model based on Mixtral 8x22B. Strong performance on reasoning and instruction-following benchmarks.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-wizardlm-2-8x22b |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-8192 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
MiniMax
Hailuo 2.3 Image to Video
Model ID: minimax_hailuo_23_i2v
Hailuo 2.3 I2V generates high-quality animated videos from images with enhanced resolution and motion quality.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | ✅ | Model ID: minimax_hailuo_23_i2v |
| first_frame_image | string | ✅ | Input image URL or Base64-encoded data URI (format: data:image/jpeg;base64,xxx) |
| prompt | string | ❌ | Video generation prompt. Supports camera motion: [Pan left], [Zoom in], etc. Max 2000 characters. |
| duration | integer | ❌ | Video duration in seconds. Default: 6. |
| resolution | string | ❌ | Video resolution. Default: 768P. |
| prompt_optimizer | boolean | ❌ | Whether to enable prompt optimization. Default: true. |
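As an illustration of the `first_frame_image` format, the sketch below encodes raw image bytes as a Base64 data URI and collects the parameters into a request body. It assumes the parameters are sent as top-level JSON fields, which this section does not show; the helper names `image_to_data_uri` and `build_i2v_payload` are hypothetical.

```python
import base64


def image_to_data_uri(image_bytes, mime="image/jpeg"):
    """Encode raw image bytes as data:<mime>;base64,<payload>."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_i2v_payload(first_frame_image, prompt=None, duration=None,
                      resolution=None, prompt_optimizer=None):
    """Collect the parameters for minimax_hailuo_23_i2v into a dict."""
    if prompt is not None and len(prompt) > 2000:
        raise ValueError("prompt must be at most 2000 characters")
    payload = {
        "model": "minimax_hailuo_23_i2v",
        "first_frame_image": first_frame_image,  # URL or data URI
    }
    optional = {
        "prompt": prompt,
        "duration": duration,
        "resolution": resolution,
        "prompt_optimizer": prompt_optimizer,
    }
    # Omit unset optional fields so the server-side defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_i2v_payload(image_to_data_uri(data), prompt="[Zoom in] a cat on a windowsill")` leaves duration, resolution, and prompt optimization at their defaults.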
Response Format
Error Codes Reference
| Code | Description |
| --- | --- |
| 401 | API key is missing or invalid |
| 403 | Insufficient balance or permission denied |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
(Documentation continues for many models and their parameters, code examples, response formats, and error codes. Each model section follows the same structure: Model Name, Model ID, brief description, Authentication block, Code Examples (tabs), Parameters Reference (expandable), Response Format, and Error Codes (expandable).)
Mistral AI
(See above pattern for model sections: Mistral Small 3.2 24B, Mistral Nemo, Mistral Small 3.1 24B, Devstral 2512, Devstral 2512 (Free).)
Nous Research
(See Hermes 3 405B Instruct, Hermes 4 405B, Hermes 4 70B sections.)
OpenAI
(See GPT-4o, GPT-4.1, GPT-4o Mini, GPT-5, GPT-4.1 Mini, OpenAI o1, o3, GPT-4.1 Nano, Sora 2, and others. Each entry includes Authentication, Code Examples, Parameters, Response Format, and Error Codes.)
Qwen
(See Qwen3 235B A22B Instruct 2507 section.)
Vidu
(See Vidu model sections: Vidu Q2 Pro Start End, Vidu Q2 Turbo Start End, Vidu Q2, Vidu Q1 I2V, Vidu Q1, and many more. Each follows the same structured format.)
Vtrix
(See Vtrix model sections: Ultra Video Extend, Ultra Lipsync, Vtrix Edit Transform, Ultra I2V Master, Vtrix Multi Ip, Ultra Video Turbo, Vtrix Image 4.5 Blend, Vtrix Edit IP, Vtrix Edit Base, Vtrix Edit Portrait, Ultra I2V Pro, Ultra Video Pro, Vtrix Motion 1.5 Pro, Vtrix Motion Pro, Vtrix Motion Base, Film Avatar Motion, Film Avatar Omni, Ultra Image Omni, Film Actor, Ultra Video Omni, Ultra I2V Turbo, Ultra I2V 1.5, Ultra Video 1.5, Vtrix Image 4.5, Ultra Video Master 2.1, Vtrix Image 4.0, Vtrix Motion Turbo, Ultra Video Plus, Ultra I2V 1.6, Vtrix Motion I2V, Ultra I2V Master 2.1, Ultra I2V 2.1, Ultra Video Master, Vtrix Image 3.0, Vtrix Edit Style, Vtrix Edit 3D, Vtrix 3D, Vtrix Edit I2I, Ultra Effects Multi 1.6, Film Image Pro, Film Image Base, Ultra Effects Multi 1.5, Ultra Effects Multi 1.0, Ultra Effects Solo, Ultra I2V Base, Film I2I, Ultra Video Base.)
xAI
Grok 4
Model ID: vtrix-grok-4
xAI's flagship reasoning model with mandatory thinking mode. 256K context, parallel tool calling, and real-time web search integration. Excels at technical reasoning.
Authentication
Code Examples
Parameters Reference
| Parameter | Type | Required | Default | Range / Options | Description |
| --- | --- | --- | --- | --- | --- |
| model | string | ✅ | - | - | Model ID: vtrix-grok-4 |
| messages | array | ✅ | - | - | Array of message objects |
| messages[].role | string | ✅ | - | user, assistant, system, developer | Message role |
| messages[].content | string/array | ✅ | - | - | Text string or multimodal array |
| max_tokens | integer | ❌ | - | 1-32768 | Maximum tokens to generate |
| temperature | number | ❌ | 1.0 | 0.0-2.0 | Sampling temperature |
| top_p | number | ❌ | 1.0 | 0.0-1.0 | Nucleus sampling parameter |
| stream | boolean | ❌ | false | true, false | Stream response incrementally |
Response Format
Error Codes Reference
| Code | Status | Description |
| --- | --- | --- |
| 401 | Unauthorized | API key is missing or invalid |
| 403 | Forbidden | Insufficient balance or permission denied |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | An internal server error occurred |
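The LLM sections on this page differ mainly in their `max_tokens` upper bound, so a client that switches between models may want those limits in one place. The sketch below collects the values listed in the parameter tables above; the names `MAX_TOKENS_LIMITS` and `clamp_max_tokens` are hypothetical, and models whose sections are not reproduced on this page are deliberately left out.

```python
# Upper bounds copied from the max_tokens ranges in the sections above.
MAX_TOKENS_LIMITS = {
    "vtrix-claude-sonnet-4.5": 64000,
    "vtrix-claude-opus-4.5": 64000,
    "vtrix-claude-3.5-haiku": 8192,
    "vtrix-claude-haiku-4.5": 64000,
    "vtrix-claude-sonnet-4": 64000,
    "vtrix-claude-3-haiku": 4096,
    "vtrix-deepseek-v3-0324": 16384,
    "vtrix-deepseek-v3.1": 16384,
    "vtrix-deepseek-r1": 32768,
    "vtrix-gemini-2.5-flash": 65535,
    "vtrix-gemini-2.5-pro": 65536,
    "vtrix-gemini-3-pro-preview": 65536,
    "vtrix-gemini-2.0-flash": 8192,
    "vtrix-wizardlm-2-8x22b": 8192,
    "vtrix-grok-4": 32768,
}


def clamp_max_tokens(model_id, requested):
    """Clamp a requested max_tokens value into the model's listed range."""
    if model_id not in MAX_TOKENS_LIMITS:
        raise KeyError(f"No max_tokens limit recorded for {model_id!r}")
    return max(1, min(requested, MAX_TOKENS_LIMITS[model_id]))
```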