Complete reference for all Gab AI API endpoints: chat completions (with vision / image input), file uploads, image generation, video generation, text-to-speech, embeddings, and more.
Base URL: https://gab.ai/v1
Generate a response from an AI model based on a conversation history. This is the primary endpoint for text generation and follows the OpenAI chat completions format.
You can pass a tools array and an optional tool_choice in the request body (see the parameters below). Supported models can then request one or more tool calls; you run the tools and send the results back in follow-up requests. Tools are used only when you send tools and the model supports function calling (see model capabilities below).
Assistant messages in messages may include tool_calls: an array of objects with id, type: "function", and function: { name, arguments }. Tool result messages use role: "tool" and must include tool_call_id (matching the id of the corresponding tool call) and content (the tool's result string).
In the response, choices[0].message may include content (string or null) and/or tool_calls (an array with the same shape: id, type, function.name, function.arguments). choices[0].finish_reason may be "tool_calls" when the model is requesting tool use (in addition to "stop", "length", etc.).
When streaming, chunks may include delta.tool_calls (an array of tool call deltas). The final chunk may have finish_reason: "tool_calls" when the model is requesting tool use.
The full loop: send messages plus optional tools and tool_choice. If the response has finish_reason: "tool_calls" and message.tool_calls, run each tool, append the assistant message (including its tool_calls) to messages, append one message per result with role "tool", tool_call_id, and content, then call the API again with the updated messages. Repeat until finish_reason is "stop" or another non-tool-call reason.
Vision-capable models accept images in the messages array using OpenAI's multimodal content format: a user message's content can be an array of parts, where each part is either { type: "text", text } or { type: "image_url", image_url: { url } }. The image_url object may carry a public HTTPS URL or a base64 data URL in url, or a file_id returned by /v1/files in place of url.
Use /v1/models to discover vision-capable models: look for capabilities.image_input: true. Passing images to a model that does not support image input returns a 400 unsupported_image_input error.
Pass response_format to constrain output to valid JSON, either freely (json_object) or against a strict JSON Schema (json_schema). The shape matches OpenAI's structured-outputs API exactly.
POST /chat/completions
model: Model ID (e.g., "arya", "gpt-5-5", "claude-opus-4-8", "gemini-3-5-flash"). Must be a slug returned by GET /v1/models; names like "Claude Opus 4.8" or "claude opus" will not match.
messages: Array of message objects with role and content. Assistant messages may include tool_calls; tool result messages use role "tool" with tool_call_id and content.
temperature: Sampling temperature (0-2, default 1).
max_tokens: Maximum tokens to generate.
stream: Enable streaming responses (default false).
top_p: Nucleus sampling parameter (0-1).
tools: List of tools the model may call. Each item has type: "function" and a function object with name, description, and parameters (JSON Schema). Same structure as OpenAI's tools parameter.
tool_choice: How the model should use tools: "auto" (let the model decide), "none" (disable tools), or { type: "function", function: { name: "<tool_name>" } } to force a specific tool.
response_format: Structured output mode. Supports OpenAI's shape: { type: "text" } (default), { type: "json_object" }, or { type: "json_schema", json_schema: { name, strict?, schema } } where schema is a JSON Schema. Best results on models with capabilities.function_calling: true.
The OpenAI tool-call shape works identically across every supported provider: arya, gpt-*, claude-*, gemini-*, deepseek-*, kimi-*, qwen-*, etc. We translate to each provider's native format (e.g., Anthropic's tool_use / tool_result blocks) on the way out and back. You always send and receive OpenAI-style tool_calls / role: "tool" messages, so existing OpenAI SDKs and agent frameworks (LangChain, Vercel AI SDK, OpenCode, LlamaIndex, …) work unchanged.
All responses include a usage object with token counts and credits_used showing how many credits were deducted for the request.
Use a public URL when the image is already hosted; use a base64 data URL for small, one-off images; and use /v1/files + file_id for larger images or anything you'll reference across multiple requests (avoids re-uploading and keeps request bodies small).
Structured outputs work best on models that advertise capabilities.function_calling: true. A few providers don't support strict json_schema — if the upstream rejects the schema you'll receive a 400 error rather than a silent fallback.
Gab AI supports OpenAI's newer Responses API shape for Codex-style agent clients and SDKs that use input / output items instead of chat messages. The endpoint maps Responses API function calls onto the same tool-calling pipeline as chat completions. Streaming responses emit Responses-style events such as response.created, response.output_text.delta, response.function_call_arguments.done, and response.completed.
POST /responses
Model ID, such as "arya", "gpt-5-5", or "claude-sonnet-4-5".
A plain prompt string or Responses API input items: message, function_call, and function_call_output.
Maximum output tokens. Also accepts max_tokens for compatibility.
Sampling temperature.
Nucleus sampling parameter.
Enable Responses-style server-sent events.
Responses API function tools: { type: "function", name, description, parameters }.
"auto", "none", "required", or { type: "function", name: "tool_name" }.
Use https://gab.ai/v1 as the OpenAI base URL and your Gab AI API key as OPENAI_API_KEY. Clients that call /v1/responses can use function tools, streaming, and function call outputs.
Gab AI also exposes an Anthropic-compatible Messages API for tools and SDKs that expect Anthropic's native format, including Claude Code. Use the root base URL https://gab.ai for Anthropic-compatible clients because they append /v1/messages themselves. Anthropic-compatible clients may call /v1/messages/count_tokens to estimate input tokens before sending a message. The response returns input_tokens.
POST /messages
POST /messages/count_tokens
Model ID. You may use a Gab slug such as "claude-sonnet-4-5" or the provider model name such as "claude-sonnet-4-5-20250929".
Anthropic messages array. User/assistant messages use content blocks; assistant tool requests use type "tool_use"; tool results are user messages with type "tool_result".
Anthropic system prompt. String content and text blocks are supported.
Maximum tokens to generate. Forwarded to Claude-compatible models.
Sampling temperature.
Nucleus sampling parameter.
Top-k sampling parameter for Anthropic models.
Enable Anthropic-style server-sent events.
Anthropic tool definitions with name, description, and input_schema. These are translated to Gab AI's internal tool-call format.
Anthropic tool choice, such as { type: "auto" }, { type: "any" }, { type: "none" }, or { type: "tool", name: "tool_name" }.
Set ANTHROPIC_BASE_URL to https://gab.ai and ANTHROPIC_AUTH_TOKEN to your Gab AI API key. Do not include /v1 in the base URL for Claude Code.
Set stream: true to receive responses as server-sent events (SSE) for real-time output:
The OpenAI SDK handles streaming automatically. Just set stream: true and iterate over the response.
Generate vector embeddings for text inputs. Embeddings are useful for semantic search, retrieval-augmented generation (RAG), clustering, and classification tasks. The endpoint shape mirrors OpenAI's /v1/embeddings.
POST /embeddings
Embedding model ID — must be a slug from /v1/models?type=embedding.
Text to embed. A single string or an array of strings (max 2048 items).
Output vector dimensions. Forwarded to the upstream provider for models that support reduced dimensionality.
Encoding format: "float" (default) or "base64"
We don't have any embedding models enabled in production at the moment, so this endpoint will return 400 invalid_model until we re-enable one. Check GET /v1/models?type=embedding for the authoritative list — when it returns a non-empty data array, use one of those id values as the model field.
Pass an array of strings as input to embed multiple texts in a single request (up to 2048 items). This is significantly faster than making individual requests.
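The 2048-item limit means a larger corpus has to be split across several requests. A minimal sketch of that batching, assuming nothing about the API beyond the limit stated above; the helper name `batches` is hypothetical:

```python
from typing import Iterator, Sequence

MAX_BATCH = 2048  # per-request input limit stated in this reference


def batches(texts: Sequence[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of `texts`, each small enough for one request."""
    if not 1 <= size <= MAX_BATCH:
        raise ValueError(f"size must be between 1 and {MAX_BATCH}")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])
```

Each yielded list can be passed as the input of one embeddings request. Results come back with an index per item, so offsets into the original list can be recovered by adding the batch's starting position.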
Check your current credit balance including monthly allotment and purchased credits.
GET /credits
Monthly credits reset at the start of each billing cycle. Purchased credits never expire and are used after monthly credits are exhausted.
Fetch a per-request log of your API usage. Each record is a single billable transaction with its own tx_id, model, token counts and credit cost, which is useful for auditing billing, building dashboards, or reconciling spend. Records are returned newest-first.
GET /usage
tx_id cursor — return records strictly newer than this one. When polling, pass the previous page's first_id (the newest tx in that response, same as data[0].id) — not last_id.
tx_id cursor — return records strictly older than this one. Use to walk backward through history.
Inclusive lower bound on created time. ISO 8601 (e.g. "2026-04-01") or unix timestamp (seconds or milliseconds).
Inclusive upper bound on created time. Same formats as start_date.
Filter to a specific endpoint, e.g. "/v1/chat/completions".
Filter to a specific model slug (e.g. "arya").
Records per page. 1–1000, default 100.
To efficiently tail new activity, persist the first_id from the previous response (the newest tx_id in that page) and pass it as after on the next call. Do not use last_id here: that field is the oldest item in the page, since results are returned newest-first.
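The cursor handling described above can be sketched as a single polling step. This is an illustration, not a full client: `fetch_page` is a hypothetical caller-supplied function that performs GET /v1/usage with the given after value and returns the parsed JSON, and the sketch does not try to fill gaps when has_more is true.

```python
from typing import Any, Callable, Optional

Page = dict[str, Any]


def tail_once(
    fetch_page: Callable[[Optional[str]], Page],
    cursor: Optional[str],
) -> tuple[list[dict[str, Any]], Optional[str]]:
    """Fetch records newer than `cursor` and return them with the next cursor."""
    page = fetch_page(cursor)
    records = page.get("data", [])
    # Results are newest-first, so first_id is the newest tx_id in the page.
    # Keep the old cursor when the page is empty.
    next_cursor = page.get("first_id") or cursor
    # Reverse so the caller can process records in chronological order.
    return list(reversed(records)), next_cursor
```

Persist the returned cursor between runs; passing last_id instead would re-deliver everything except the oldest record of the previous page.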
List all available models. Optionally filter by type to get only text, image, video, audio, or embedding models.
GET /models
Filter by model type: "text", "image", "video", "audio", or "embedding"
The capabilities object lists boolean flags for every feature the model supports. Notable flags for chat: text, streaming, function_calling, thinking, web_search, image_input, file_input. Pass tools only to models with function_calling: true — otherwise the request fails with unsupported_tool_calling.
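Checking capability flags before sending tools or images avoids the unsupported_tool_calling and unsupported_image_input errors. A small sketch of that check over a parsed /v1/models response; the helper name `models_with` is hypothetical:

```python
from typing import Any


def models_with(models_response: dict[str, Any], *flags: str) -> list[str]:
    """Return the ids of models whose capabilities set every given flag to true."""
    matches = []
    for model in models_response.get("data", []):
        capabilities = model.get("capabilities", {})
        # A missing flag is treated the same as false.
        if all(capabilities.get(flag) is True for flag in flags):
            matches.append(model["id"])
    return matches
```

For example, `models_with(resp, "function_calling", "image_input")` lists the models that can take both tools and image parts in one request.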
Generate images from text prompts using various AI image models. Returns URLs to the generated images.
POST /images/generations
Image model slug (e.g., "gpt-image-2", "gpt-image-1", "gpt-image-1-mini", "imagen-4-0", "nano-banana-2", "seedream-4-5", "seedream-4-0", "qwen-image-2", "qwen-image", "gemini-2-5-flash-image"). Defaults to "gpt-image-1-mini". Use GET /v1/models?type=image for the live list.
Text description of the image to generate. Maximum length is model-specific (characters). If exceeded, the API returns 400 with code prompt_too_long. Use GET /v1/models?type=image and max_prompt_characters on each model (e.g. image-generator: 2,048; gpt-image-2: 32,000).
Number of images to generate (1-4, default 1). Some models cap n=1 internally.
Image size, model-dependent. Common values: "1024x1024", "1792x1024", "1024x1792".
Quality level: "standard" (default) or "hd". Forwarded to providers that support it.
Use /v1/models?type=image to see all available image generation models, their capabilities, and max_prompt_characters per model. Long structured text (recipes, articles) often exceeds smaller models — summarize to a short visual description or use gpt-image-2.
Upload a file once and reference it by file_id in other API calls, most commonly as an image input to /v1/chat/completions. Files are scoped to the authenticated user (the owner of the API key) and are not accessible to other users. The response shape is OpenAI-compatible.
POST /files
Upload a file using multipart/form-data. Returns a file object with an id you can use as file_id in subsequent requests.
file: The binary file to upload (multipart field "file").
purpose: What the file will be used for: "vision" (image for chat), "assistants", "user_data" (default), "batch", or "fine-tune". Purpose "vision" requires an image file.
GET /files
List files uploaded by the current API key's user. Optionally filter by purpose.
limit: Page size (1-100, default 20).
purpose: Filter by purpose (e.g., "vision").
GET /files/:fileId
Retrieve metadata for a single file.
DELETE /files/:fileId
Delete a file. Removes the stored object and marks the file as deleted. Any outstanding references (e.g., file_id in a chat message) will fail with file_not_found after deletion.
Per-file upload limit is 50 MB. Individual types have their own caps in downstream processing (e.g., Whisper transcription is capped at 25 MB). Image types accepted: PNG, JPEG, GIF, WebP, HEIC, SVG.
Generate videos from text prompts. Video generation is asynchronous: you receive a job ID and poll for the result.
POST /videos/generations
Start a video generation job. Returns a job ID to poll for completion.
Video model slug (e.g., "veo-3-1", "veo-3-1-fast", "sora-2", "kling-3-0-pro", "kling-2-5-turbo-pro", "hailuo-2-3-pro", "wan-2-5", "seedance-2-0"). Defaults to "veo-3-1-fast". Use GET /v1/models?type=video for the live list.
Text description of the video to generate.
Video duration in seconds (model dependent).
Aspect ratio: "16:9", "9:16", "1:1".
GET /videos/:jobId
Check the status of a video generation job. Poll this endpoint until status is "completed" or "failed".
Video generation typically takes 1-3 minutes depending on duration and model. Implement exponential backoff when polling to avoid rate limits.
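The polling advice above can be sketched as a loop with exponential backoff. This is one possible shape, not a prescribed client: `get_job` is a hypothetical caller-supplied function that performs GET /v1/videos/:jobId and returns the parsed JSON, and the delay values are illustrative defaults rather than documented limits.

```python
import time
from typing import Any, Callable


def wait_for_video(
    get_job: Callable[[], dict[str, Any]],
    sleep: Callable[[float], None] = time.sleep,
    initial_delay: float = 5.0,
    max_delay: float = 60.0,
    timeout: float = 600.0,
) -> dict[str, Any]:
    """Poll until the job status is "completed" or "failed", doubling the wait each time."""
    delay = initial_delay
    waited = 0.0
    while True:
        job = get_job()
        if job.get("status") in ("completed", "failed"):
            return job
        if waited + delay > timeout:
            raise TimeoutError("video job did not finish before the timeout")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

The `sleep` parameter exists so the loop can be exercised in tests without real waiting.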
Convert text to natural-sounding speech. Unlike OpenAI's TTS endpoint, Gab AI returns a JSON object with a CDN URL to the generated audio (rather than streaming raw audio bytes). You can fetch the URL to download the file.
POST /audio/speech
TTS model slug. Currently "gpt-4o-mini-tts" (the default if omitted). Use GET /v1/models?type=audio to discover others as we add them.
The text to convert to speech (max 4096 chars)
Voice ID. For gpt-4o-mini-tts: "alloy" (default), "ash", "ballad", "coral", "echo", "fable", "nova", "onyx", "sage", "shimmer", "verse".
Audio format: "mp3" (default), "opus", "aac", "flac".
Speaking speed (0.25 to 4.0, default 1.0)
Optional voice/style instructions forwarded to the model (e.g., "Speak in a calm, slow voice").
OpenAI's client.audio.speech.create(...) expects raw audio bytes back, so calling Gab AI through the OpenAI Python SDK with response.stream_to_file() will not work. Use raw HTTP (or fetch the returned url after parsing JSON) instead.
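A raw-HTTP version of that flow might look like the sketch below, using only the standard library. The response field url comes from the note above; the request field names input and voice are an assumption borrowed from OpenAI's TTS request shape, and the helper names are hypothetical.

```python
import json
import urllib.request

SPEECH_URL = "https://gab.ai/v1/audio/speech"


def speech_request(api_key: str, text: str, voice: str = "alloy",
                   model: str = "gpt-4o-mini-tts") -> urllib.request.Request:
    """Build the POST request; nothing is sent until it is opened."""
    # Assumption: "input" and "voice" mirror OpenAI's TTS field names.
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        SPEECH_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
    )


def save_speech(api_key: str, text: str, dest: str) -> str:
    """Request speech, then download the audio from the returned CDN URL."""
    with urllib.request.urlopen(speech_request(api_key, text)) as resp:
        audio_url = json.load(resp)["url"]
    with urllib.request.urlopen(audio_url) as audio, open(dest, "wb") as out:
        out.write(audio.read())
    return audio_url
```

Splitting request construction from sending keeps the JSON step explicit, which is the part that differs from OpenAI's byte-streaming endpoint.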
Programmatically manage your API keys. List existing keys, create new ones, or revoke keys that are no longer needed.
GET /api-keys
List all API keys associated with your account. Keys are returned with masked values for security.
POST /api-keys
Create a new API key. The full key is only shown once in the response, so store it securely.
A friendly name for the key (e.g., "Production", "Dev Server").
DELETE /api-keys/:keyId
Revoke an API key. The key will immediately stop working for all requests.
Keys created via this endpoint use the prefix gab- (followed by 64 hex chars). Keys created via the in-app Settings UI use the prefix gab_ (followed by 32 hex chars). Both forms are valid as Authorization: Bearer tokens — treat the entire string as opaque.
The full API key is only returned once when created. Store it securely—you won't be able to retrieve it again.
You cannot delete the API key being used to authenticate the request — the server returns 400 cannot_delete_current_key. Use a different active key (or the dashboard) to revoke it. Deletion is a soft delete: the key is marked inactive and rejected from then on.
Download a portable archive of every record we hold for the API key's owner: profile, conversations, memories, files, custom agents, collections, voice sessions, bookmark folders, automated tasks, purchases, credit purchases, referrals, and feedback. This is the same data that powers the in-app Settings → Download your data button, exposed here for backup tools, GDPR / CCPA workflows, and self-hosted analytics pipelines. The ZIP wraps the same data in a structure that's easy to browse without writing any code. The schema version is pinned in the manifest so consumers can detect breaking changes.
GET /account/export
Either "json" (default for the developer API — one big JSON document) or "zip" (a user-friendly archive with per-collection JSON files, individual conversation markdown files, a README, and a manifest).
When set to "false", conversations are returned without their (often very large) messages array. Defaults to true so you get the full archive.
Deleted records and temporary ("incognito") chats are filtered out before we build the archive. We also strip credentials (passwords, password reset tokens, MFA secrets, backup codes), payment-processor IDs (Authorize.net / Valmar customer + payment-method IDs), memory embedding vectors, files attached to temporary chats, and the secret value of API keys. Everything else that belongs to your account is in the export.
Each collection is capped at a generous limit (5,000 conversations, 10,000 files, etc.). If a collection trips its cap the JSON response includes the key in the top-level truncated object (and the ZIP's README spells it out). Email support@gab.ai for a complete offline archive if you need the full history.
The export endpoint is intentionally heavy. The same daily API rate limit applies, but in practice you should only need a handful of calls — cache the result locally rather than polling.
The messages array in chat completions supports these roles:
system: Sets the behavior and context for the assistant. API requests do NOT inherit the API key owner's profile (name, location, etc.), so if you want the assistant to know who the end user is, describe them here (e.g. "You are chatting with Jane from Austin.").
user: The human user's messages.
assistant: Previous AI responses (for context). May include tool_calls: an array of { id, type: "function", function: { name, arguments } } when using tools.
tool: Tool result messages. Must include tool_call_id (matching the tool call id) and content (the tool's result string).
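All errors follow a consistent format: an error object containing message, type, and code. The error.code field gives a stable machine-readable identifier independent of the HTTP status. Codes that appear elsewhere in this reference include unsupported_image_input, unsupported_tool_calling, invalid_model, prompt_too_long, file_not_found, and cannot_delete_current_key.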
All API responses include rate limit headers to help you manage your usage.
When you receive a 429 error, check the X-RateLimit-Reset header to know when you can resume making requests.
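The error object and the 429 guidance can be combined in one small helper. A sketch under stated assumptions: the class and function names are hypothetical, the headers mapping is expected to use the exact header spelling shown above, and the X-RateLimit-Reset value is passed through as a raw string because its format is not specified here.

```python
from typing import Any, Mapping, Optional


class GabApiError(Exception):
    """Carries the fields of the API's error object plus the HTTP status."""

    def __init__(self, status: int, code: Optional[str], type_: Optional[str],
                 message: str, rate_limit_reset: Optional[str] = None) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.type = type_
        self.message = message
        self.rate_limit_reset = rate_limit_reset


def raise_for_error(status: int, body: Any, headers: Mapping[str, str]) -> None:
    """Raise GabApiError for 4xx/5xx responses; return None otherwise."""
    if status < 400:
        return
    error = body.get("error", {}) if isinstance(body, dict) else {}
    # Only 429 responses need the reset header to decide when to retry.
    reset = headers.get("X-RateLimit-Reset") if status == 429 else None
    raise GabApiError(status, error.get("code"), error.get("type"),
                      error.get("message", ""), reset)
```

Branching on `exc.code` rather than on the message text keeps client logic stable if the wording of messages changes.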
const response = await fetch('https://gab.ai/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'arya',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'What is the capital of France?' }
],
temperature: 0.7,
max_tokens: 1000
})
});
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1704067200,
"model": "arya",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The capital of France is Paris."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 12,
"total_tokens": 37,
"credits_used": 1
}
}
{
"model": "arya",
"messages": [
{ "role": "user", "content": "What is the weather in Paris?" }
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a location.",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string", "description": "City name" }
},
"required": ["location"]
}
}
}
],
"tool_choice": "auto"
}
{
"id": "chatcmpl-xyz",
"object": "chat.completion",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": null,
"tool_calls": [
{
"id": "call_abc123",
"type": "function",
"function": {
"name": "get_weather",
"arguments": "{\"location\": \"Paris\"}"
}
}
]
},
"finish_reason": "tool_calls"
}
],
"usage": { "prompt_tokens": 20, "completion_tokens": 25, "total_tokens": 45, "credits_used": 1 }
}
{
"model": "arya",
"messages": [
{ "role": "user", "content": "What is the weather in Paris?" },
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"id": "call_abc123",
"type": "function",
"function": { "name": "get_weather", "arguments": "{\"location\": \"Paris\"}" }
}
]
},
{
"role": "tool",
"tool_call_id": "call_abc123",
"content": "Sunny, 22°C"
}
],
"tools": [
{ "type": "function", "function": { "name": "get_weather", "description": "Get weather for a location.", "parameters": { "type": "object", "properties": { "location": { "type": "string" } }, "required": ["location"] } } }
],
"tool_choice": "auto"
}
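The request, tool-call response, and follow-up request shown above form one round of the tool-calling loop. A minimal sketch of driving that loop to completion, assuming a caller-supplied `send` function (hypothetical) that POSTs the current messages to /chat/completions and returns the parsed JSON body; `run_tool_loop` and the `max_rounds` guard are illustrative choices, not part of the API:

```python
import json
from typing import Any, Callable

Message = dict[str, Any]


def run_tool_loop(
    send: Callable[[list[Message]], dict[str, Any]],
    messages: list[Message],
    tools: dict[str, Callable[..., str]],
    max_rounds: int = 8,
) -> Message:
    """Call the API until the model stops requesting tools; return the final message."""
    for _ in range(max_rounds):
        choice = send(messages)["choices"][0]
        message = choice["message"]
        tool_calls = message.get("tool_calls") or []
        if choice.get("finish_reason") != "tool_calls" or not tool_calls:
            return message
        # Append the assistant turn unchanged, including its tool_calls.
        messages.append(message)
        for call in tool_calls:
            fn = call["function"]
            args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
            result = tools[fn["name"]](**args)
            # One tool message per call, linked back by tool_call_id.
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": result}
            )
    raise RuntimeError("tool loop did not finish within max_rounds")
```

The `messages` list is mutated in place so the caller keeps the full transcript, including tool results, for later turns.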
{
"model": "arya",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "What's in this image?" },
{ "type": "image_url", "image_url": { "url": "https://example.com/photo.jpg" } }
]
}
]
}
{
"model": "arya",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "Describe this receipt." },
{ "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBORw0KGgoAAAA..." } }
]
}
]
}
{
"model": "arya",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "Summarize this diagram." },
{ "type": "image_url", "image_url": { "file_id": "<FILE_ID_FROM_V1_FILES>" } }
]
}
]
}
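The three image forms above (public URL, base64 data URL, file_id) differ only in the image_url object. A short sketch of helpers that build each part and assemble a user message; the function names are hypothetical:

```python
import base64
from typing import Any


def image_part_from_url(url: str) -> dict[str, Any]:
    """Image part for a public HTTPS URL or an existing data URL."""
    return {"type": "image_url", "image_url": {"url": url}}


def image_part_from_bytes(data: bytes, mime_type: str = "image/png") -> dict[str, Any]:
    """Image part that inlines raw bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return image_part_from_url(f"data:{mime_type};base64,{encoded}")


def image_part_from_file_id(file_id: str) -> dict[str, Any]:
    """Image part that references a file previously uploaded to /v1/files."""
    return {"type": "image_url", "image_url": {"file_id": file_id}}


def vision_message(text: str, *image_parts: dict[str, Any]) -> dict[str, Any]:
    """User message with one text part followed by any number of image parts."""
    return {"role": "user", "content": [{"type": "text", "text": text}, *image_parts]}
```

Base64 inflates the payload by roughly a third, which is why the inline form suits small, one-off images.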
{
"model": "arya",
"messages": [
{ "role": "system", "content": "Extract the user's name and age. Reply with JSON only." },
{ "role": "user", "content": "Hi, I'm Alice and I'm 30." }
],
"response_format": { "type": "json_object" }
}
{
"model": "arya",
"messages": [
{ "role": "user", "content": "Extract the user's name and age from: 'Hi, I'm Alice and I'm 30.'" }
],
"response_format": {
"type": "json_schema",
"json_schema": {
"name": "person",
"strict": true,
"schema": {
"type": "object",
"properties": {
"name": { "type": "string" },
"age": { "type": "integer" }
},
"required": ["name", "age"],
"additionalProperties": false
}
}
}
}
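With either response_format mode, the structured result still arrives as a JSON string in choices[0].message.content and has to be parsed by the client. A sketch of building the json_schema format and parsing the reply; the helper names are hypothetical, and the required-key check is an extra client-side guard rather than something the API asks for:

```python
import json
from typing import Any


def json_schema_format(name: str, schema: dict[str, Any], strict: bool = True) -> dict[str, Any]:
    """Build a response_format value for json_schema mode."""
    return {"type": "json_schema", "json_schema": {"name": name, "strict": strict, "schema": schema}}


def parse_structured(response: dict[str, Any], required: list[str]) -> dict[str, Any]:
    """Decode the assistant's JSON content and confirm the expected keys exist."""
    content = response["choices"][0]["message"]["content"]
    if content is None:
        raise ValueError("response has no content to parse")
    data = json.loads(content)
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data
```

The guard matters most in json_object mode, where the output is valid JSON but is not checked against a schema.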
export OPENAI_BASE_URL="https://gab.ai/v1"
export OPENAI_API_KEY="YOUR_GAB_API_KEY"
codex
{
"model": "arya",
"input": [
{
"role": "user",
"content": [
{ "type": "input_text", "text": "What is the weather in Paris?" }
]
}
],
"tools": [
{
"type": "function",
"name": "get_weather",
"description": "Get weather for a location.",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string" }
},
"required": ["location"]
}
}
],
"tool_choice": { "type": "function", "name": "get_weather" }
}
{
"id": "resp_abc123",
"object": "response",
"status": "completed",
"model": "arya",
"output": [
{
"id": "fc_abc123",
"type": "function_call",
"status": "completed",
"call_id": "call_abc123",
"name": "get_weather",
"arguments": "{\"location\":\"Paris\"}"
}
],
"usage": {
"input_tokens": 80,
"output_tokens": 25,
"total_tokens": 105,
"credits_used": 1
}
}
{
"model": "arya",
"input": [
{
"role": "user",
"content": [{ "type": "input_text", "text": "What is the weather in Paris?" }]
},
{
"id": "fc_abc123",
"type": "function_call",
"call_id": "call_abc123",
"name": "get_weather",
"arguments": "{\"location\":\"Paris\"}"
},
{
"type": "function_call_output",
"call_id": "call_abc123",
"output": "{\"temperature\":22,\"unit\":\"celsius\",\"condition\":\"sunny\"}"
}
],
"tools": [
{
"type": "function",
"name": "get_weather",
"description": "Get weather for a location.",
"parameters": {
"type": "object",
"properties": { "location": { "type": "string" } },
"required": ["location"]
}
}
]
}
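In the Responses shape, the follow-up request echoes each function_call item and adds a function_call_output with the same call_id. A sketch of deriving those input items from a response, assuming local tool functions keyed by name; `follow_up_items` is a hypothetical helper:

```python
import json
from typing import Any, Callable


def follow_up_items(
    response: dict[str, Any],
    tools: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Build the input items to append after a response that requested tools."""
    items: list[dict[str, Any]] = []
    for item in response.get("output", []):
        if item.get("type") != "function_call":
            continue
        result = tools[item["name"]](**json.loads(item["arguments"]))
        # Echo the function_call item, then attach its output via call_id.
        items.append(
            {key: item[key] for key in ("id", "type", "call_id", "name", "arguments") if key in item}
        )
        items.append(
            {
                "type": "function_call_output",
                "call_id": item["call_id"],
                "output": json.dumps(result),  # output is sent as a JSON string
            }
        )
    return items
```

Appending the returned items to the original input list reproduces the follow-up request shown above.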
export ANTHROPIC_BASE_URL="https://gab.ai"
export ANTHROPIC_AUTH_TOKEN="YOUR_GAB_API_KEY"
claude
{
"model": "claude-sonnet-4-5",
"max_tokens": 1024,
"system": "You are a helpful coding assistant.",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "Read package.json and summarize the scripts." }
]
}
],
"tools": [
{
"name": "read_file",
"description": "Read a file from the current workspace.",
"input_schema": {
"type": "object",
"properties": {
"path": { "type": "string" }
},
"required": ["path"]
}
}
],
"tool_choice": { "type": "auto" }
}
{
"id": "msg_abc123",
"type": "message",
"role": "assistant",
"model": "claude-sonnet-4-5-20250929",
"content": [
{
"type": "tool_use",
"id": "toolu_abc123",
"name": "read_file",
"input": { "path": "package.json" }
}
],
"stop_reason": "tool_use",
"stop_sequence": null,
"usage": {
"input_tokens": 120,
"output_tokens": 32
}
}
{
"model": "claude-sonnet-4-5",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": [{ "type": "text", "text": "Read package.json and summarize the scripts." }]
},
{
"role": "assistant",
"content": [
{
"type": "tool_use",
"id": "toolu_abc123",
"name": "read_file",
"input": { "path": "package.json" }
}
]
},
{
"role": "user",
"content": [
{
"type": "tool_result",
"tool_use_id": "toolu_abc123",
"content": "{\"scripts\":{\"start\":\"node server.js\"}}"
}
]
}
],
"tools": [
{
"name": "read_file",
"description": "Read a file from the current workspace.",
"input_schema": {
"type": "object",
"properties": { "path": { "type": "string" } },
"required": ["path"]
}
}
]
}
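On the Anthropic-compatible endpoint, tool results go back as a single user message whose content blocks are tool_result entries. A sketch of building that message from an assistant response's content blocks, assuming local tool functions that return strings; `tool_result_message` is a hypothetical helper:

```python
from typing import Any, Callable


def tool_result_message(
    assistant_content: list[dict[str, Any]],
    tools: dict[str, Callable[..., str]],
) -> dict[str, Any]:
    """Run every tool_use block and return the matching user message."""
    results = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue  # text blocks need no reply
        output = tools[block["name"]](**block["input"])  # input is already an object
        results.append(
            {"type": "tool_result", "tool_use_id": block["id"], "content": output}
        )
    return {"role": "user", "content": results}
```

Unlike the OpenAI shape, input is a parsed object rather than a JSON string, so no json.loads step is needed.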
curl https://gab.ai/v1/messages/count_tokens \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_GAB_API_KEY" \
-d '{
"model": "claude-sonnet-4-5",
"messages": [
{ "role": "user", "content": [{ "type": "text", "text": "Hello!" }] }
]
}'
const response = await fetch('https://gab.ai/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'arya',
messages: [{ role: 'user', content: 'Tell me a story' }],
stream: true
})
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
// stream: true keeps multi-byte characters intact when they span two reads
const chunk = decoder.decode(value, { stream: true });
// Parse SSE data and handle tokens
console.log(chunk);
}
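The "parse SSE data" step left as a comment above can be sketched as follows. This assumes the OpenAI streaming convention, which this reference does not spell out: each event is a line of the form `data: <json>`, text arrives in choices[].delta.content, and the stream ends with `data: [DONE]`. The function names are hypothetical.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield the decoded JSON payload of each data line until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream marker
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate every delta.content fragment in the stream."""
    parts = []
    for chunk in iter_sse_data(lines):
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)
```

The input must be complete lines. Network reads can end mid-line, so a real client should buffer incoming bytes and split on newlines before calling these helpers.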
curl "https://gab.ai/v1/models?type=embedding" \
-H "Authorization: Bearer YOUR_API_KEY" \
| jq '.data[].id'
const response = await fetch('https://gab.ai/v1/embeddings', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'YOUR_EMBEDDING_MODEL',
input: 'The quick brown fox jumps over the lazy dog'
})
});
from openai import OpenAI
client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://gab.ai/v1"
)
response = client.embeddings.create(
model="YOUR_EMBEDDING_MODEL",
input=["First document", "Second document"]
)
for item in response.data:
    print(f"Index {item.index}: {len(item.embedding)} dimensions")
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.0023, -0.0091, 0.0152, ...],
"index": 0
}
],
"model": "YOUR_EMBEDDING_MODEL",
"usage": {
"prompt_tokens": 9,
"total_tokens": 9,
"credits_used": 1
}
}
curl https://gab.ai/v1/credits \
-H "Authorization: Bearer YOUR_API_KEY"
{
"object": "credit_balance",
"monthly_credits": 2000,
"monthly_used": 450,
"monthly_remaining": 1550,
"purchased_credits": 500,
"total_available": 2050,
"resets_at": "2026-02-01T00:00:00Z"
}
# Last 100 requests
curl https://gab.ai/v1/usage \
-H "Authorization: Bearer YOUR_API_KEY"
# Poll for new activity since the last tx_id you've seen
curl "https://gab.ai/v1/usage?after=65f2a1b3c4d5e6f7a8b9c0d1" \
-H "Authorization: Bearer YOUR_API_KEY"
# All chat.completions usage for March 2026
curl "https://gab.ai/v1/usage?start_date=2026-03-01&end_date=2026-03-31&endpoint=/v1/chat/completions&limit=1000" \
-H "Authorization: Bearer YOUR_API_KEY"
{
"object": "list",
"data": [
{
"id": "65f2a1b3c4d5e6f7a8b9c0d1",
"object": "usage_record",
"created": 1712345678,
"endpoint": "/v1/chat/completions",
"model": "arya",
"credits_used": 1,
"tokens": {
"prompt": 842,
"completion": 156,
"total": 998
},
"context_tokens": 842,
"response_time_ms": 1245,
"status_code": 200,
"success": true
}
],
"has_more": false,
"first_id": "65f2a1b3c4d5e6f7a8b9c0d1",
"last_id": "65f2a1b3c4d5e6f7a8b9c0d1"
}
# Get all models
curl https://gab.ai/v1/models \
-H "Authorization: Bearer YOUR_API_KEY"
# Get only image models
curl "https://gab.ai/v1/models?type=image" \
-H "Authorization: Bearer YOUR_API_KEY"
{
"object": "list",
"data": [
{
"id": "arya",
"object": "model",
"created": 1700000000,
"owned_by": "gab-ai",
"capabilities": {
"text": true,
"images": false,
"video": false,
"audio": false,
"streaming": true,
"thinking": false,
"web_search": true,
"function_calling": true,
"embeddings": false,
"image_input": true,
"file_input": true,
"audio_input": false,
"video_input": false
},
"context_window": 128000,
"max_output_tokens": 8192,
"credit_cost": { "base_cost": 1, "context_threshold": 20000 },
"is_plus_only": false
},
{
"id": "claude-opus-4-7",
"object": "model",
"created": 1700000000,
"owned_by": "anthropic",
"capabilities": {
"text": true,
"streaming": true,
"thinking": true,
"function_calling": true,
"image_input": true,
"file_input": true
},
"context_window": 200000,
"max_output_tokens": 8192,
"credit_cost": { "base_cost": 5, "context_threshold": 20000 },
"is_plus_only": true
},
{
"id": "gpt-image-2",
"object": "model",
"created": 1700000000,
"owned_by": "openai",
"capabilities": {
"images": true
},
"credit_cost": { "base_cost": 15, "context_threshold": 0 },
"is_plus_only": true
}
// ... more models
]
}
const response = await fetch('https://gab.ai/v1/images/generations', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'gpt-image-2',
prompt: 'A serene mountain landscape at sunset with a lake reflection',
n: 1,
size: '1024x1024',
quality: 'hd'
})
});
{
"created": 1704067200,
"data": [
{
"url": "https://cdn.gab.ai/images/generated/abc123.png",
"revised_prompt": "A serene mountain landscape at sunset..."
}
],
"usage": {
"credits_used": 5
}
}
Use /v1/models?type=image to see all available image generation models, their capabilities, and max_prompt_characters per model. Long structured text (recipes, articles) often exceeds the max_prompt_characters limit of smaller models; summarize it to a short visual description or use gpt-image-2.
curl -X POST https://gab.ai/v1/files \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "purpose=vision" \
-F "file=@./photo.png"
const form = new FormData();
form.append('purpose', 'vision');
form.append('file', fileBlob, 'photo.png');
const res = await fetch('https://gab.ai/v1/files', {
method: 'POST',
headers: { 'Authorization': 'Bearer YOUR_API_KEY' },
body: form,
});
const uploaded = await res.json();
// uploaded.id — use this as file_id in /v1/chat/completions
from openai import OpenAI
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://gab.ai/v1")
uploaded = client.files.create(
file=open("photo.png", "rb"),
purpose="vision",
)
chat = client.chat.completions.create(
model="arya",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What's in this image?"},
{"type": "image_url", "image_url": {"file_id": uploaded.id}},
],
}],
)
print(chat.choices[0].message.content)
{
"id": "67a1b2c3d4e5f6a7b8c9d0e1",
"object": "file",
"bytes": 184523,
"created_at": 1704067200,
"filename": "photo.png",
"purpose": "vision",
"mime_type": "image/png",
"file_type": "image",
"url": "https://cdn.gab.ai/users/abc/files/photo.png",
"status": "active"
}
curl "https://gab.ai/v1/files?purpose=vision&limit=50" \
-H "Authorization: Bearer YOUR_API_KEY"
curl https://gab.ai/v1/files/FILE_ID \
-H "Authorization: Bearer YOUR_API_KEY"
curl -X DELETE https://gab.ai/v1/files/FILE_ID \
-H "Authorization: Bearer YOUR_API_KEY"
{ "id": "FILE_ID", "object": "file", "deleted": true }
The per-file upload limit is 50 MB. Individual file types have their own caps in downstream processing (e.g., Whisper transcription is capped at 25 MB). Accepted image types: PNG, JPEG, GIF, WebP, HEIC, SVG.
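A client-side pre-check can catch obviously invalid image uploads before spending a request. This is a rough sketch under stated assumptions: `check_image_upload` is a hypothetical helper, 50 MB is read as 50 × 1024 × 1024 bytes, and the filename extensions are a guess at how the listed image types map to files. The server remains the authority on what it accepts.

```python
import os
from typing import List

# Assumption: "50 MB" means 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024

# Assumption: common extensions for PNG, JPEG, GIF, WebP, HEIC, SVG.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".heic", ".svg"}


def check_image_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in IMAGE_EXTENSIONS:
        problems.append(f"unsupported image extension: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems
```

For a file on disk, pass `os.path.getsize(path)` as `size_bytes`.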
const response = await fetch('https://gab.ai/v1/videos/generations', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'veo-3-1-fast',
prompt: 'A golden retriever running through a field of flowers',
duration: 5,
aspect_ratio: '16:9'
})
});
{
"id": "video_job_xyz789",
"object": "video.generation",
"status": "processing",
"created": 1704067200,
"model": "veo-3-1-fast",
"prompt": "A golden retriever running through a field of flowers",
"usage": {
"credits_used": 20
}
}
curl https://gab.ai/v1/videos/video_job_xyz789 \
-H "Authorization: Bearer YOUR_API_KEY"
{
"id": "video_job_xyz789",
"object": "video.generation",
"status": "completed",
"created": 1704067200,
"completed_at": 1704067320,
"model": "veo-3-1-fast",
"data": [
{
"url": "https://cdn.gab.ai/videos/generated/xyz789.mp4",
"duration": 5,
"thumbnail": "https://cdn.gab.ai/videos/thumbnails/xyz789.jpg"
}
]
}
Video generation typically takes 1-3 minutes depending on duration and model. Implement exponential backoff when polling to avoid rate limits.
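The polling advice above might look like the following sketch. It rests on several assumptions: `wait_for_video` and `get_job` are hypothetical names, `get_job` stands for your own function that performs GET /v1/videos/{id} and returns the parsed JSON, and "processing" is treated as the only in-progress status because it is the only one shown in this reference. The delay values are illustrative, not recommended by the API.

```python
import time
from typing import Callable


def wait_for_video(
    get_job: Callable[[], dict],
    initial_delay: float = 5.0,
    max_delay: float = 60.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job leaves "processing", doubling the wait each time.

    Returns the first job payload whose status is not "processing"; the
    caller decides what to do with "completed" versus any other status.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        job = get_job()
        if job.get("status") != "processing":
            return job
        if waited + delay > timeout:
            raise TimeoutError(f"job still processing after {waited:.0f}s of waiting")
        sleep(delay)
        waited += delay
        # Exponential backoff, capped so a long job is still checked regularly.
        delay = min(delay * 2, max_delay)
```

The `sleep` parameter exists so the loop can be exercised in tests without real waiting; production code can leave it at the default.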
const response = await fetch('https://gab.ai/v1/audio/speech', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
model: 'gpt-4o-mini-tts',
input: 'Hello! Welcome to Gab AI. How can I help you today?',
voice: 'nova',
response_format: 'mp3',
speed: 1.0
})
});
const { url } = await response.json();
// Fetch the generated audio
const audio = await fetch(url).then(r => r.blob());
const audioUrl = URL.createObjectURL(audio);
import requests
resp = requests.post(
"https://gab.ai/v1/audio/speech",
headers={"Authorization": "Bearer YOUR_API_KEY"},
json={
"model": "gpt-4o-mini-tts",
"voice": "nova",
"input": "Hello! Welcome to Gab AI.",
},
).json()
# Download the generated audio
with open("output.mp3", "wb") as f:
f.write(requests.get(resp["url"]).content)
{
"url": "https://cdn.gab.ai/audio/generated/abc123.mp3",
"content_type": "audio/mp3",
"usage": { "credits_used": 1 }
}
OpenAI's client.audio.speech.create(...) expects raw audio bytes back, so calling Gab AI through the OpenAI Python SDK with response.stream_to_file() will not work. Use raw HTTP (or fetch the returned url after parsing JSON) instead.
curl https://gab.ai/v1/api-keys \
-H "Authorization: Bearer YOUR_API_KEY"
{
"object": "list",
"data": [
{
"id": "65f2a1b3c4d5e6f7a8b9c0d1",
"object": "api_key",
"key": "gab_5e17...d85f",
"name": "Production Key",
"is_active": true,
"created": 1704067200,
"last_used": 1704672000,
"usage": { "requests": 4815, "tokens": 162342 }
},
{
"id": "65f2a1b3c4d5e6f7a8b9c0d2",
"object": "api_key",
"key": "gab_a1b2...c3d4",
"name": "Development Key",
"is_active": true,
"created": 1704153600,
"last_used": null,
"usage": { "requests": 0, "tokens": 0 }
}
]
}
curl -X POST https://gab.ai/v1/api-keys \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"name": "New Production Key"}'
{
"id": "65f2a1b3c4d5e6f7a8b9c0d3",
"object": "api_key",
"key": "gab-5e17f695110d1e02c90e4537445d85fce4f5...",
"name": "New Production Key",
"is_active": true,
"created": 1704067200,
"message": "Save this key securely. It will not be shown again."
}
curl -X DELETE https://gab.ai/v1/api-keys/65f2a1b3c4d5e6f7a8b9c0d1 \
-H "Authorization: Bearer YOUR_API_KEY"
{
"id": "65f2a1b3c4d5e6f7a8b9c0d1",
"object": "api_key",
"deleted": true
}
Keys created via this endpoint use the prefix gab- (followed by 64 hex chars). Keys created via the in-app Settings UI use the prefix gab_ (followed by 32 hex chars). Both forms are valid as Authorization: Bearer tokens — treat the entire string as opaque.
The full API key is only returned once when created. Store it securely—you won't be able to retrieve it again.
You cannot delete the API key being used to authenticate the request — the server returns 400 cannot_delete_current_key. Use a different active key (or the dashboard) to revoke it. Deletion is a soft delete: the key is marked inactive and rejected from then on.
# JSON bundle (default)
curl https://gab.ai/v1/account/export \
-H "Authorization: Bearer YOUR_API_KEY" \
-o gab-ai-account-data.json
# ZIP archive (same shape the Settings UI downloads)
curl "https://gab.ai/v1/account/export?format=zip" \
-H "Authorization: Bearer YOUR_API_KEY" \
-o gab-ai-account-data.zip
# Slim JSON: metadata only, no message bodies
curl "https://gab.ai/v1/account/export?include_messages=false" \
-H "Authorization: Bearer YOUR_API_KEY" \
-o gab-ai-account-data-slim.json
{
"object": "account_data_export",
"exported_at": "2026-05-02T18:00:00.000Z",
"user_id": "65f2a1b3c4d5e6f7a8b9c0d1",
"counts": {
"conversations": 412,
"memories": 87,
"files": 23,
"agents": 4,
"collections": 6,
"voice_sessions": 18,
"purchases": 3,
"credit_purchases": 11,
"referrals": 2,
"feedback": 1,
"inference_tasks": 0,
"bookmark_folders": 5,
"api_keys": 2
},
"truncated": {},
"data": {
"user": { "_id": "...", "email": "...", "username": "...", "...": "..." },
"conversations": [ /* full conversation rows with messages */ ],
"memories": [ /* memory rows (embeddings stripped) */ ],
"files": [ /* file metadata + cdn paths */ ],
"agents": [ /* your custom agents */ ],
"collections": [ /* your collections */ ],
"voice_sessions": [ /* voice mode session metadata */ ],
"purchases": [ /* Plus subscription purchases */ ],
"credit_purchases": [ /* one-time credit packs */ ],
"referrals": [ /* referrals you sent or received */ ],
"feedback": [ /* feedback you've submitted */ ],
"inference_tasks": [ /* scheduled / recurring tasks */ ],
"bookmark_folders": [ /* bookmark organization */ ],
"api_keys": [ /* metadata only — secret values are never re-exported */ ]
}
}
gab-ai-account-data-2026-05-02.zip
├── README.md # human-readable overview, counts, layout
├── manifest.json # exported_at, counts, truncated, schema_version
├── profile.json # your account profile
├── conversations.json # raw bundle of every conversation
├── conversations/ # one .md file per conversation, easy to read
│ ├── 2026-05-01-untitled-conversation-abc123.md
│ └── ...
├── memories.json # raw memory bundle
├── memories.txt # one memory per line, easy to grep
├── files.json
├── agents.json
├── collections.json
├── voice_sessions.json
├── bookmark_folders.json
├── inference_tasks.json
├── purchases.json
├── credit_purchases.json
├── referrals.json
├── feedback.json
└── api_keys.json # metadata only
Deleted records and temporary ("incognito") chats are filtered out before we build the archive. We also strip credentials (passwords, password reset tokens, MFA secrets, backup codes), payment-processor IDs (Authorize.net / Valmar customer + payment-method IDs), memory embedding vectors, files attached to temporary chats, and the secret value of API keys. Everything else that belongs to your account is in the export.
Each collection is capped at a generous limit (5,000 conversations, 10,000 files, etc.). If a collection trips its cap the JSON response includes the key in the top-level truncated object (and the ZIP's README spells it out). Email support@gab.ai for a complete offline archive if you need the full history.
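After downloading the JSON bundle, it is worth checking whether any collection hit its cap. A small sketch, assuming only that capped collections appear as keys of the top-level `truncated` object; the shape of the values under those keys is not documented here, so the hypothetical helper `truncated_collections` ignores them.

```python
from typing import List


def truncated_collections(export: dict) -> List[str]:
    """Return the names of collections that were cut off at their cap.

    Only the keys of `truncated` are used; their values are not interpreted.
    """
    return sorted(export.get("truncated", {}).keys())


# Usage with a saved export file (path as in the curl example):
#   import json
#   with open("gab-ai-account-data.json") as f:
#       names = truncated_collections(json.load(f))
#   if names:
#       print("Incomplete collections:", ", ".join(names))
```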
The export endpoint is intentionally heavy. The same daily API rate limit applies, but in practice you should only need a handful of calls — cache the result locally rather than polling.
{
"error": {
"message": "Insufficient credits to complete request",
"type": "insufficient_credits",
"code": "credits_exhausted",
"param": null
}
}
When you receive a 429 error, check the X-RateLimit-Reset header to know when you can resume making requests.
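A hedged sketch of acting on these responses. This reference does not state the format of X-RateLimit-Reset, so the code below assumes it is a Unix timestamp in seconds; if the API actually sends a number of seconds to wait, use the value directly instead. `seconds_until_reset` and `error_code` are hypothetical helper names.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """How long to wait before retrying after a 429.

    Assumption: X-RateLimit-Reset is a Unix timestamp in seconds.
    Returns 0.0 when the header is missing or already in the past.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get("x-ratelimit-reset")
    if raw is None:
        return 0.0
    now = time.time() if now is None else now
    return max(0.0, float(raw) - now)


def error_code(body: dict) -> Optional[str]:
    """Extract error.code from an error response body, or None if absent."""
    return body.get("error", {}).get("code")
```

A caller would typically sleep for `seconds_until_reset(response_headers)` on a 429 and branch on `error_code(body)` for other failures, for example stopping retries on `credits_exhausted`.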