API Endpoints

Complete reference for all Gab AI API endpoints: chat completions (with vision / image input), file uploads, image generation, video generation, text-to-speech, embeddings, and more.

Base URL

https://gab.ai/v1

Chat Completions

Generate a response from an AI model based on a conversation history. This is the primary endpoint for text generation and follows the OpenAI chat completions format.

You can pass a tools array and an optional tool_choice in the request body (see Request Body below). Supported models can then request one or more tool calls; you run the tools and send the results back in follow-up requests. Tools are only used when you send tools and the model supports function calling (see Model capabilities under List Models).

In requests, assistant messages in messages may include tool_calls: an array of objects with id, type: "function", and function: { name, arguments }. Tool result messages use role: "tool" and must include tool_call_id (matching the id of the corresponding tool call) and content (the tool's result string).

In responses, choices[0].message may include content (string or null) and/or tool_calls (an array with the same shape: id, type, function.name, function.arguments). choices[0].finish_reason may be "tool_calls" when the model is requesting tool use (in addition to "stop", "length", etc.).

When streaming, chunks may include delta.tool_calls (an array of tool call deltas). The final chunk may have finish_reason: "tool_calls" when the model is requesting tool use.

The typical flow is: send messages plus optional tools and tool_choice. If the response has finish_reason: "tool_calls" and message.tool_calls, run each tool, append the assistant message (including its tool_calls) to messages, append one message per result with role "tool", tool_call_id, and content, then call the API again with the updated messages. Repeat until finish_reason is "stop" or another non-tool-call reason.

Vision-capable models accept images in the messages array using OpenAI's multimodal content format: a user message's content can be an array of parts, where each part is either { type: "text", text } or { type: "image_url", image_url: { url } }. The image_url object may carry a public HTTPS URL or a base64 data URL in url, or a file_id returned by /v1/files. Use /v1/models to discover vision-capable models: look for capabilities.image_input: true. Passing images to a model that does not support image input returns a 400 unsupported_image_input error.

Pass response_format to constrain output to valid JSON, either freely (json_object) or against a strict JSON Schema (json_schema). The shape matches OpenAI's structured-outputs API exactly.

POST /chat/completions
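The tool-calling loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not an official client: send is a hypothetical callable that POSTs the payload to /chat/completions and returns the parsed JSON body, handlers is a hypothetical mapping from tool name to a local Python function, and the max_rounds guard is an added safety measure that the API itself does not require.

```python
import json
from typing import Any, Callable, Dict, List, Optional, Tuple

Message = Dict[str, Any]


def run_tool_loop(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    model: str,
    messages: List[Message],
    tools: List[Dict[str, Any]],
    handlers: Dict[str, Callable[..., Any]],
    max_rounds: int = 5,
) -> Tuple[Optional[str], List[Message]]:
    """Call the API until the model stops requesting tools.

    `send` and `handlers` are hypothetical hooks supplied by the caller.
    Returns the final assistant content and the accumulated message history.
    """
    history = list(messages)  # work on a copy, not the caller's list
    for _ in range(max_rounds):
        response = send({
            "model": model,
            "messages": list(history),
            "tools": tools,
            "tool_choice": "auto",
        })
        choice = response["choices"][0]
        message = choice["message"]
        tool_calls = message.get("tool_calls") or []
        if choice.get("finish_reason") != "tool_calls" or not tool_calls:
            return message.get("content"), history
        # Append the assistant message, including its tool_calls.
        history.append(message)
        for call in tool_calls:
            function = call["function"]
            arguments = json.loads(function["arguments"])  # arguments is a JSON string
            result = handlers[function["name"]](**arguments)
            # One role "tool" message per result, matched by tool_call_id.
            history.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": str(result),
            })
    raise RuntimeError("model kept requesting tools after max_rounds")
```

Keeping the HTTP call behind send lets the same loop run against a stub in tests or against any HTTP client in production.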

Request Body

model (string, required)

Model ID (e.g., "arya", "gpt-5-5", "claude-opus-4-8", "gemini-3-5-flash"). Must be a slug returned by GET /v1/models — names like "Claude Opus 4.8" or "claude opus" will not match.

messages (array, required)

Array of message objects with role and content. Assistant messages may include tool_calls; tool result messages use role "tool" with tool_call_id and content.

temperature (number, optional)

Sampling temperature (0-2, default 1)

max_tokens (integer, optional)

Maximum tokens to generate

stream (boolean, optional)

Enable streaming responses (default false)

top_p (number, optional)

Nucleus sampling parameter (0-1)

tools (array, optional)

List of tools the model may call. Each item has type: "function" and a function object with name, description, and parameters (JSON Schema). Same structure as OpenAI's tools parameter.

tool_choice (string | object, optional)

How the model should use tools: "auto" (let the model decide), "none" (disable tools), or { type: "function", function: { name: "<tool_name>" } } to force a specific tool.

response_format (object, optional)

Structured output mode. Supports OpenAI's shape: { type: "text" } (default), { type: "json_object" }, or { type: "json_schema", json_schema: { name, strict?, schema } } where schema is a JSON Schema. Best results on models with capabilities.function_calling: true.

Request Example

Response Example

Request with tools (non-streaming)

Response with tool_calls

Follow-up request with tool results

Image via public URL

Image via base64 data URL

Image via /v1/files upload (recommended for large / reused images)

json_object — free-form JSON

json_schema — strict, typed output

Provider-agnostic

The OpenAI tool-call shape works identically across every supported provider — arya, gpt-*, claude-*, gemini-*, deepseek-*, kimi-*, qwen-*, etc. We translate to each provider's native format (e.g., Anthropic's tool_use/ tool_result blocks) on the way out and back. You always send and receive OpenAI-style tool_calls / role: "tool" messages, so existing OpenAI SDKs and agent frameworks (LangChain, Vercel AI SDK, OpenCode, LlamaIndex, …) work unchanged.

Token Usage

All responses include a usage object with token counts and credits_used showing how many credits were deducted for the request.

Which image input should I use?

Use a public URL when the image is already hosted; use a base64 data URL for small, one-off images; and use /v1/files + file_id for larger images or anything you'll reference across multiple requests (avoids re-uploading and keeps request bodies small).
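The three image input styles differ only in what goes inside image_url. A minimal sketch of building each content part is shown below; the helper names are hypothetical and exist only for illustration, while the part shapes follow the request examples in this reference.

```python
import base64
from typing import Any, Dict


def image_part_from_url(url: str) -> Dict[str, Any]:
    """Image that is already hosted at a public HTTPS URL."""
    return {"type": "image_url", "image_url": {"url": url}}


def image_part_from_bytes(data: bytes, mime_type: str = "image/png") -> Dict[str, Any]:
    """Small one-off image, inlined as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}}


def image_part_from_file_id(file_id: str) -> Dict[str, Any]:
    """Image previously uploaded through /v1/files."""
    return {"type": "image_url", "image_url": {"file_id": file_id}}


def vision_message(text: str, image_part: Dict[str, Any]) -> Dict[str, Any]:
    """User message combining a text part and one image part."""
    return {"role": "user", "content": [{"type": "text", "text": text}, image_part]}
```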

Compatibility

Structured outputs work best on models that advertise capabilities.function_calling: true. A few providers don't support strict json_schema — if the upstream rejects the schema you'll receive a 400 error rather than a silent fallback.
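As a rough sketch of consuming structured output on the client side, the helpers below build a strict json_schema response_format and parse the reply. Both function names are hypothetical, and the check on finish_reason "length" is an added precaution (an assumption, not documented behavior) against parsing output that was cut off.

```python
import json
from typing import Any, Dict


def json_schema_format(name: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Build a strict json_schema response_format object."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


def parse_structured_reply(response: Dict[str, Any]) -> Any:
    """Decode the JSON text carried in choices[0].message.content."""
    choice = response["choices"][0]
    if choice.get("finish_reason") == "length":
        # Assumption: a reply truncated by max_tokens may not be complete JSON.
        raise ValueError("output was cut off; JSON may be incomplete")
    return json.loads(choice["message"]["content"])
```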

OpenAI Responses

Gab AI supports OpenAI's newer Responses API shape for Codex-style agent clients and SDKs that use input / output items instead of chat messages. The endpoint maps Responses API function calls onto the same tool-calling pipeline as chat completions. Streaming responses emit Responses-style events such as response.created, response.output_text.delta, response.function_call_arguments.done, and response.completed.

POST /responses

Request Body

model (string, required)

Model ID, such as "arya", "gpt-5-5", or "claude-sonnet-4-5".

input (string | array, required)

A plain prompt string or Responses API input items: message, function_call, and function_call_output.

max_output_tokens (integer, optional)

Maximum output tokens. Also accepts max_tokens for compatibility.

temperature (number, optional)

Sampling temperature.

top_p (number, optional)

Nucleus sampling parameter.

stream (boolean, optional)

Enable Responses-style server-sent events.

tools (array, optional)

Responses API function tools: { type: "function", name, description, parameters }.

tool_choice (string | object, optional)

"auto", "none", "required", or { type: "function", name: "tool_name" }.

Codex-style environment

Request with a function tool

Response with function_call

Follow-up with function_call_output

Codex / OpenAI-compatible CLI setup

Use https://gab.ai/v1 as the OpenAI base URL and your Gab AI API key as OPENAI_API_KEY. Clients that call /v1/responses can use function tools, streaming, and function call outputs.
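For clients that talk to /v1/responses directly, the follow-up step can be sketched as below: echo each function_call item from output back into input, then add a matching function_call_output. The function name and the handlers mapping are hypothetical, and serializing results with json.dumps is an assumption; the item shapes mirror the examples in this reference.

```python
import json
from typing import Any, Callable, Dict, List

ECHOED_KEYS = ("id", "type", "call_id", "name", "arguments")


def answer_function_calls(
    input_items: List[Dict[str, Any]],
    response: Dict[str, Any],
    handlers: Dict[str, Callable[..., Any]],
) -> List[Dict[str, Any]]:
    """Return the input array for the follow-up /responses request."""
    follow_up = list(input_items)
    for item in response.get("output", []):
        if item.get("type") != "function_call":
            continue
        # Echo the model's function call back so the output can be matched to it.
        follow_up.append({key: item[key] for key in ECHOED_KEYS if key in item})
        result = handlers[item["name"]](**json.loads(item["arguments"]))
        follow_up.append({
            "type": "function_call_output",
            "call_id": item["call_id"],
            "output": json.dumps(result),  # assumption: results are sent as JSON text
        })
    return follow_up
```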

Anthropic Messages

Gab AI also exposes an Anthropic-compatible Messages API for tools and SDKs that expect Anthropic's native format, including Claude Code. Use the root base URL https://gab.ai for Anthropic-compatible clients because they append /v1/messages themselves.

POST /messages

POST /messages/count_tokens
Anthropic-compatible clients may call /v1/messages/count_tokens to estimate input tokens before sending a message. The response returns input_tokens.

Request Body

model (string, required)

Model ID. You may use a Gab slug such as "claude-sonnet-4-5" or the provider model name such as "claude-sonnet-4-5-20250929".

messages (array, required)

Anthropic messages array. User/assistant messages use content blocks; assistant tool requests use type "tool_use"; tool results are user messages with type "tool_result".

system (string | array, optional)

Anthropic system prompt. String content and text blocks are supported.

max_tokens (integer, optional)

Maximum tokens to generate. Forwarded to Claude-compatible models.

temperature (number, optional)

Sampling temperature.

top_p (number, optional)

Nucleus sampling parameter.

top_k (number, optional)

Top-k sampling parameter for Anthropic models.

stream (boolean, optional)

Enable Anthropic-style server-sent events.

tools (array, optional)

Anthropic tool definitions with name, description, and input_schema. These are translated to Gab AI's internal tool-call format.

tool_choice (object, optional)

Anthropic tool choice, such as { type: "auto" }, { type: "any" }, { type: "none" }, or { type: "tool", name: "tool_name" }.

Claude Code

Request with tools

Response with tool_use

Follow-up with tool_result

Count tokens

Claude Code setup

Set ANTHROPIC_BASE_URL to https://gab.ai and ANTHROPIC_AUTH_TOKEN to your Gab AI API key. Do not include /v1 in the base URL for Claude Code.
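If you call the Messages endpoint yourself rather than through Claude Code, the tool_result follow-up can be sketched as follows. The function name and the handlers mapping are hypothetical; the block shapes (tool_use in the assistant turn, tool_result inside a user turn) follow the Anthropic format described above.

```python
from typing import Any, Callable, Dict, List


def tool_result_turns(
    response: Dict[str, Any],
    handlers: Dict[str, Callable[..., Any]],
) -> List[Dict[str, Any]]:
    """Return the two messages to append after a tool_use response.

    Returns an empty list when the response requested no tools.
    """
    results = []
    for block in response.get("content", []):
        if block.get("type") != "tool_use":
            continue
        output = handlers[block["name"]](**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use block id
            "content": str(output),
        })
    if not results:
        return []
    return [
        {"role": "assistant", "content": response["content"]},
        {"role": "user", "content": results},
    ]
```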

Streaming Responses

Set stream: true to receive responses as server-sent events (SSE) for real-time output:

Streaming Example

OpenAI SDK Support

The OpenAI SDK handles streaming automatically. Just set stream: true and iterate over the response.
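Without an SDK, the stream can be consumed by reading the event lines directly. The sketch below assumes OpenAI-style SSE framing, which this reference does not spell out: each event is a line of the form "data: <json chunk>" and the stream ends with "data: [DONE]". It collects only delta.content text and ignores delta.tool_calls.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content from decoded SSE lines.

    Assumes OpenAI-style framing ("data: {...}" lines, "[DONE]" sentinel).
    """
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)
```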

Embeddings

Generate vector embeddings for text inputs. Embeddings are useful for semantic search, retrieval-augmented generation (RAG), clustering, and classification tasks. The endpoint shape mirrors OpenAI's /v1/embeddings.

POST /embeddings

Request Body

model (string, required)

Embedding model ID — must be a slug from /v1/models?type=embedding.

input (string | array, required)

Text to embed. A single string or an array of strings (max 2048 items).

dimensions (integer, optional)

Output vector dimensions. Forwarded to the upstream provider for models that support reduced dimensionality.

encoding_format (string, optional)

Encoding format: "float" (default) or "base64"

Discover the current embedding model

Request Example (replace YOUR_EMBEDDING_MODEL with a slug from above)

Python Example

Response Example

Currently no enabled embedding models

We don't have any embedding models enabled in production at the moment, so this endpoint will return 400 invalid_model until we re-enable one. Check GET /v1/models?type=embedding for the authoritative list — when it returns a non-empty data array, use one of those id values as the model field.

Batching

Pass an array of strings as input to embed multiple texts in a single request (up to 2048 items). This is significantly faster than making individual requests.
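When you have more texts than fit in one request, split them into chunks that respect the 2048-item cap. A minimal sketch follows; the function name is hypothetical and the model slug is whatever /v1/models?type=embedding currently returns.

```python
from typing import Any, Dict, Iterator, List

MAX_INPUTS_PER_REQUEST = 2048  # documented cap on the input array


def embedding_payloads(
    model: str,
    texts: List[str],
    batch_size: int = MAX_INPUTS_PER_REQUEST,
) -> Iterator[Dict[str, Any]]:
    """Yield one /embeddings request body per batch of texts."""
    if not 1 <= batch_size <= MAX_INPUTS_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 2048")
    for start in range(0, len(texts), batch_size):
        yield {"model": model, "input": texts[start:start + batch_size]}
```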

Check Credits

Check your current credit balance, including monthly allotment and purchased credits.

GET /credits

Request Example

Response Example

Credit Types

Monthly credits reset at the start of each billing cycle. Purchased credits never expire and are used after monthly credits are exhausted.

Usage History

Fetch a per-request log of your API usage. Each record is a single billable transaction with its own tx_id, model, token counts, and credit cost, which is useful for auditing billing, building dashboards, or reconciling spend. Records are returned newest-first.

GET /usage

Query Parameters

after (string, optional)

tx_id cursor — return records strictly newer than this one. When polling, pass the previous page's first_id (the newest tx in that response, same as data[0].id) — not last_id.

before (string, optional)

tx_id cursor — return records strictly older than this one. Use to walk backward through history.

start_date (string | number, optional)

Inclusive lower bound on created time. ISO 8601 (e.g. "2026-04-01") or unix timestamp (seconds or milliseconds).

end_date (string | number, optional)

Inclusive upper bound on created time. Same formats as start_date.

endpoint (string, optional)

Filter to a specific endpoint, e.g. "/v1/chat/completions".

model (string, optional)

Filter to a specific model slug (e.g. "arya").

limit (integer, optional)

Records per page. 1–1000, default 100.

Request Examples

Response Example

Polling pattern

To efficiently tail new activity, persist the first_id from the previous response (the newest tx_id in that page) and pass it as after on the next call. Do not use last_id here: that field is the oldest item in the page, since results are returned newest-first.
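The polling pattern can be sketched as a single step that takes the saved cursor and returns the new one. Here fetch_page is a hypothetical callable that performs GET /usage with the given query parameters and returns the parsed JSON; keeping the old cursor when a page is empty is an assumption about how an empty response looks.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


def poll_usage_once(
    fetch_page: Callable[[Dict[str, Any]], Dict[str, Any]],
    cursor: Optional[str],
    limit: int = 100,
) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Fetch records newer than `cursor` and return (records, next_cursor)."""
    params: Dict[str, Any] = {"limit": limit}
    if cursor is not None:
        params["after"] = cursor  # strictly newer than the last seen tx_id
    page = fetch_page(params)
    records = page.get("data", [])
    # first_id is the newest tx_id in the page; never advance with last_id.
    next_cursor = page.get("first_id") or cursor
    return records, next_cursor
```

Persist next_cursor between runs so a restart resumes from the last transaction you saw.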

List Models

List all available models. Optionally filter by type to get only text, image, video, audio, or embedding models.

GET /models

Query Parameters

type (string, optional)

Filter by model type: "text", "image", "video", "audio", or "embedding"

Request Examples

Response Example

Model capabilities

The capabilities object lists boolean flags for every feature the model supports. Notable flags for chat: text, streaming, function_calling, thinking, web_search, image_input, file_input. Pass tools only to models with function_calling: true — otherwise the request fails with unsupported_tool_calling.
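A small sketch of using the capability flags before sending a request: given a parsed /v1/models response, pick the model ids that advertise every flag you need. The function name is hypothetical; the data, id, and capabilities fields are the ones described in this reference.

```python
from typing import Any, Dict, List


def models_supporting(models_response: Dict[str, Any], *flags: str) -> List[str]:
    """Return ids of models whose capabilities set every given flag to true."""
    matching = []
    for model in models_response.get("data", []):
        capabilities = model.get("capabilities", {})
        if all(capabilities.get(flag) is True for flag in flags):
            matching.append(model["id"])
    return matching
```

For example, models_supporting(response, "function_calling", "image_input") lists models that accept both tools and images.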

Image Generation

Generate images from text prompts using various AI image models. Returns URLs to the generated images.

POST /images/generations

Request Body

model (string, optional)

Image model slug (e.g., "gpt-image-2", "gpt-image-1", "gpt-image-1-mini", "imagen-4-0", "nano-banana-2", "seedream-4-5", "seedream-4-0", "qwen-image-2", "qwen-image", "gemini-2-5-flash-image"). Defaults to "gpt-image-1-mini". Use GET /v1/models?type=image for the live list.

prompt (string, required)

Text description of the image to generate. Maximum length is model-specific (characters). If exceeded, the API returns 400 with code prompt_too_long. Use GET /v1/models?type=image and max_prompt_characters on each model (e.g. image-generator: 2,048; gpt-image-2: 32,000).

n (integer, optional)

Number of images to generate (1-4, default 1). Some models cap n=1 internally.

size (string, optional)

Image size, model-dependent. Common values: "1024x1024", "1792x1024", "1024x1792".

quality (string, optional)

Quality level: "standard" (default) or "hd". Forwarded to providers that support it.

Request Example

Response Example

Image Models

Use /v1/models?type=image to see all available image generation models, their capabilities, and max_prompt_characters per model. Long structured text (recipes, articles) often exceeds smaller models — summarize to a short visual description or use gpt-image-2.

Files

Upload a file once and reference it by file_id in other API calls, most commonly as an image input to /v1/chat/completions. Files are scoped to the authenticated user (the owner of the API key) and are not accessible to other users. The response shape is OpenAI-compatible.

POST /files
Upload a file using multipart/form-data. Returns a file object with an id you can use as file_id in subsequent requests.

GET /files
List files uploaded by the current API key's user. Optionally filter by purpose.

GET /files/:fileId
Retrieve metadata for a single file.

DELETE /files/:fileId
Delete a file. Removes the stored object and marks the file as deleted. Any outstanding references (e.g., file_id in a chat message) will fail with file_not_found after deletion.

Form Fields

file (file, required)

The binary file to upload (multipart field "file").

purpose (string, optional)

What the file will be used for: "vision" (image for chat), "assistants", "user_data" (default), "batch", or "fine-tune". Purpose "vision" requires an image file.

Query Parameters

limit (integer, optional)

Page size (1-100, default 20).

purpose (string, optional)

Filter by purpose (e.g., "vision").

Upload with curl

Upload with fetch

Upload with OpenAI SDK

Response

List files

Retrieve file

Delete file

Response

Limits

Per-file upload limit is 50 MB. Individual types have their own caps in downstream processing (e.g., Whisper transcription is capped at 25 MB). Image types accepted: PNG, JPEG, GIF, WebP, HEIC, SVG.

Video Generation

Generate videos from text prompts. Video generation is asynchronous: you receive a job ID and poll for the result.

POST /videos/generations
Start a video generation job. Returns a job ID to poll for completion.

GET /videos/:jobId
Check the status of a video generation job. Poll this endpoint until status is "completed" or "failed".

Request Body

model (string, optional)

Video model slug (e.g., "veo-3-1", "veo-3-1-fast", "sora-2", "kling-3-0-pro", "kling-2-5-turbo-pro", "hailuo-2-3-pro", "wan-2-5", "seedance-2-0"). Defaults to "veo-3-1-fast". Use GET /v1/models?type=video for the live list.

prompt (string, required)

Text description of the video to generate

duration (integer, optional)

Video duration in seconds (model dependent)

aspect_ratio (string, optional)

Aspect ratio: "16:9", "9:16", "1:1"

Start Generation

Response (HTTP 202 — job queued)

Poll Status

Completed Response

Processing Time

Video generation typically takes 1-3 minutes depending on duration and model. Implement exponential backoff when polling to avoid rate limits.
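Polling with exponential backoff might look like the sketch below. get_job is a hypothetical callable that performs GET /videos/:jobId and returns the parsed JSON; the delay values and attempt limit are illustrative assumptions, not documented recommendations.

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    get_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    sleep: Callable[[float], None] = time.sleep,
    initial_delay: float = 5.0,
    max_delay: float = 60.0,
    max_attempts: int = 20,
) -> Dict[str, Any]:
    """Poll until status is "completed" or "failed", doubling the wait each time."""
    delay = initial_delay
    for _ in range(max_attempts):
        job = get_job(job_id)
        if job.get("status") in ("completed", "failed"):
            return job
        sleep(delay)
        delay = min(delay * 2, max_delay)  # back off, but cap the wait
    raise TimeoutError(f"video job {job_id} did not finish in {max_attempts} polls")
```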

Text-to-Speech

Convert text to natural-sounding speech. Unlike OpenAI's TTS endpoint, Gab AI returns a JSON object with a CDN URL to the generated audio (rather than streaming raw audio bytes). You can fetch the URL to download the file.

POST /audio/speech

Request Body

model (string, optional)

TTS model slug. Currently "gpt-4o-mini-tts" (the default if omitted). Use GET /v1/models?type=audio to discover others as we add them.

input (string, required)

The text to convert to speech (max 4096 chars)

voice (string, optional)

Voice ID. For gpt-4o-mini-tts: "alloy" (default), "ash", "ballad", "coral", "echo", "fable", "nova", "onyx", "sage", "shimmer", "verse".

response_format (string, optional)

Audio format: "mp3" (default), "opus", "aac", "flac".

speed (number, optional)

Speaking speed (0.25 to 4.0, default 1.0)

instructions (string, optional)

Optional voice/style instructions forwarded to the model (e.g., "Speak in a calm, slow voice").

Request Example

Python Example (raw HTTP — not the OpenAI SDK)

Response Example

Differs from OpenAI's SDK

OpenAI's client.audio.speech.create(...) expects raw audio bytes back, so calling Gab AI through the OpenAI Python SDK with response.stream_to_file() will not work. Use raw HTTP (or fetch the returned url after parsing JSON) instead.
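The two-step flow (request the speech, then download the returned URL) can be sketched with only the Python standard library. The helper names are hypothetical, and reading the audio location from a top-level url field is an assumption based on the wording above; check the Response Example for the actual field name.

```python
import json
import urllib.request


def build_speech_request(api_key: str, text: str, voice: str = "alloy") -> urllib.request.Request:
    """Build the POST /audio/speech request without sending it."""
    if len(text) > 4096:
        raise ValueError("input is limited to 4096 characters")
    body = json.dumps({
        "model": "gpt-4o-mini-tts",
        "input": text,
        "voice": voice,
        "response_format": "mp3",
    }).encode("utf-8")
    return urllib.request.Request(
        "https://gab.ai/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
        method="POST",
    )


def download_speech(api_key: str, text: str, path: str) -> str:
    """Generate speech, save the audio to `path`, and return the audio URL."""
    with urllib.request.urlopen(build_speech_request(api_key, text)) as response:
        audio_url = json.load(response)["url"]  # assumed field name
    with urllib.request.urlopen(audio_url) as audio, open(path, "wb") as out:
        out.write(audio.read())
    return audio_url
```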

API Key Management

Programmatically manage your API keys. List existing keys, create new ones, or revoke keys that are no longer needed.

GET /api-keys
List all API keys associated with your account. Keys are returned with masked values for security.

POST /api-keys
Create a new API key. The full key is only shown once in the response, so store it securely.

DELETE /api-keys/:keyId
Revoke an API key. The key will immediately stop working for all requests.

Request Body

name (string, optional)

A friendly name for the key (e.g., "Production", "Dev Server")

Request

Response

Request

Response (HTTP 201)

Request

Response

Key prefix

Keys created via this endpoint use the prefix gab- (followed by 64 hex chars). Keys created via the in-app Settings UI use the prefix gab_ (followed by 32 hex chars). Both forms are valid as Authorization: Bearer tokens — treat the entire string as opaque.

Save Your Key

The full API key is only returned once when created. Store it securely—you won't be able to retrieve it again.

Self-deletion blocked

You cannot delete the API key being used to authenticate the request — the server returns 400 cannot_delete_current_key. Use a different active key (or the dashboard) to revoke it. Deletion is a soft delete: the key is marked inactive and rejected from then on.

Account Data Export

Download a portable archive of every record we hold for the API key's owner: profile, conversations, memories, files, custom agents, collections, voice sessions, bookmark folders, automated tasks, purchases, credit purchases, referrals, and feedback. This is the same data that powers the in-app Settings → Download your data button, exposed here for backup tools, GDPR / CCPA workflows, and self-hosted analytics pipelines. The ZIP format wraps the same data as the JSON format in a structure that's easy to browse without writing any code. The schema version is pinned in the manifest so consumers can detect breaking changes.

GET /account/export

Query Parameters

format (string, optional)

Either "json" (default for the developer API — one big JSON document) or "zip" (a user-friendly archive with per-collection JSON files, individual conversation markdown files, a README, and a manifest).

include_messages (boolean, optional)

When set to "false", conversations are returned without their (often very large) messages array. Defaults to true so you get the full archive.

Request

JSON Response Shape

Archive contents

What's excluded

Deleted records and temporary ("incognito") chats are filtered out before we build the archive. We also strip credentials (passwords, password reset tokens, MFA secrets, backup codes), payment-processor IDs (Authorize.net / Valmar customer + payment-method IDs), memory embedding vectors, files attached to temporary chats, and the secret value of API keys. Everything else that belongs to your account is in the export.

Truncation

Each collection is capped at a generous limit (5,000 conversations, 10,000 files, etc.). If a collection trips its cap the JSON response includes the key in the top-level truncated object (and the ZIP's README spells it out). Email support@gab.ai for a complete offline archive if you need the full history.

Rate limits

The export endpoint is intentionally heavy. The same daily API rate limit applies, but in practice you should only need a handful of calls — cache the result locally rather than polling.

Message Roles

The messages array in chat completions supports these roles:

system

Sets the behavior and context for the assistant. API requests do NOT inherit the API key owner's profile (name, location, etc.), so if you want the assistant to know who the end user is, describe them here (e.g. "You are chatting with Jane from Austin.").

user

The human user's messages

assistant

Previous AI responses (for context). May include tool_calls, an array of { id, type: "function", function: { name, arguments } } objects, when using tools.

tool

Tool result messages. Must include tool_call_id (matching the tool call id) and content (the tool’s result string).

Error Codes

All errors follow a consistent format with an error object containing message, type, and code. HTTP status codes:

  1. 400 (Bad request): Invalid parameters or malformed request
  2. 401 (Unauthorized): Invalid or missing API key
  3. 402 (Payment required): Insufficient credits
  4. 403 (Forbidden): Access denied to this resource
  5. 404 (Not found): Invalid model, endpoint, or resource
  6. 429 (Rate limited): Too many requests, check rate limit headers
  7. 500 (Server error): Internal error, try again later
  8. 502 (Bad upstream response): e.g. invalid_tool_calls (auto-refunded)
  9. 503 (Service unavailable): Model temporarily unavailable

The error.code field gives a stable machine-readable identifier independent of the HTTP status. The most common ones:

  1. missing_api_key: Authorization header not provided.
  2. invalid_api_key: API key is invalid or revoked.
  3. credits_exhausted: No credits remaining on the account.
  4. plus_required: Model requires a Gab AI Plus subscription.
  5. rate_limit_exceeded: Daily request limit reached.
  6. invalid_model: Model not found or disabled.
  7. unsupported_tool_calling: tools was passed to a model with function_calling: false. Check /v1/models capabilities.
  8. unsupported_image_input: image_url / input_image part was passed to a model with image_input: false.
  9. invalid_tool_calls: Every tool_call returned by the model was malformed (invalid JSON args, unknown tool name, etc.). The inference is automatically refunded; offending calls are included in the error payload.
  10. file_not_found: A file_id reference could not be resolved or is not owned by the API key user.
  11. inference_failed: Upstream model provider failed to generate a response.

Error Response Format

Rate Limit Headers

All API responses include rate limit headers to help you manage your usage:

  1. X-RateLimit-Limit — Maximum requests allowed per day (10,000 for Plus)
  2. X-RateLimit-Remaining — Number of requests remaining in the current period
  3. X-RateLimit-Reset — Unix timestamp when the rate limit resets

Handling Rate Limits

When you receive a 429 error, check the X-RateLimit-Reset header to know when you can resume making requests.
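A minimal sketch of turning the reset header into a wait time is shown below. The function name is hypothetical, and treating X-RateLimit-Reset as a Unix timestamp in seconds is an assumption drawn from the header description above.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before retrying after a 429 response."""
    current = time.time() if now is None else now
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    reset = lowered.get("x-ratelimit-reset")
    if reset is None:
        return 0.0
    # Assumption: the value is a Unix timestamp expressed in seconds.
    return max(0.0, float(reset) - current)
```

Sleep for the returned number of seconds, then retry the request.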

const response = await fetch('https://gab.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'arya',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'What is the capital of France?' }
    ],
    temperature: 0.7,
    max_tokens: 1000
  })
});
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "arya",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 12,
    "total_tokens": 37,
    "credits_used": 1
  }
}
{
  "model": "arya",
  "messages": [
    { "role": "user", "content": "What is the weather in Paris?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string", "description": "City name" }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}
{
  "id": "chatcmpl-xyz",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Paris\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": { "prompt_tokens": 20, "completion_tokens": 25, "total_tokens": 45, "credits_used": 1 }
}
{
  "model": "arya",
  "messages": [
    { "role": "user", "content": "What is the weather in Paris?" },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": { "name": "get_weather", "arguments": "{\"location\": \"Paris\"}" }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_abc123",
      "content": "Sunny, 22°C"
    }
  ],
  "tools": [
    { "type": "function", "function": { "name": "get_weather", "description": "Get weather for a location.", "parameters": { "type": "object", "properties": { "location": { "type": "string" } }, "required": ["location"] } } }
  ],
  "tool_choice": "auto"
}
{
  "model": "arya",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What's in this image?" },
        { "type": "image_url", "image_url": { "url": "https://example.com/photo.jpg" } }
      ]
    }
  ]
}
{
  "model": "arya",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Describe this receipt." },
        { "type": "image_url", "image_url": { "url": "data:image/png;base64,iVBORw0KGgoAAAA..." } }
      ]
    }
  ]
}
{
  "model": "arya",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Summarize this diagram." },
        { "type": "image_url", "image_url": { "file_id": "<FILE_ID_FROM_V1_FILES>" } }
      ]
    }
  ]
}
{
  "model": "arya",
  "messages": [
    { "role": "system", "content": "Extract the user's name and age. Reply with JSON only." },
    { "role": "user", "content": "Hi, I'm Alice and I'm 30." }
  ],
  "response_format": { "type": "json_object" }
}
{
  "model": "arya",
  "messages": [
    { "role": "user", "content": "Extract the user's name and age from: 'Hi, I'm Alice and I'm 30.'" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "person",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "age":  { "type": "integer" }
        },
        "required": ["name", "age"],
        "additionalProperties": false
      }
    }
  }
}

export OPENAI_BASE_URL="https://gab.ai/v1"
export OPENAI_API_KEY="YOUR_GAB_API_KEY"

codex
{
  "model": "arya",
  "input": [
    {
      "role": "user",
      "content": [
        { "type": "input_text", "text": "What is the weather in Paris?" }
      ]
    }
  ],
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get weather for a location.",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string" }
        },
        "required": ["location"]
      }
    }
  ],
  "tool_choice": { "type": "function", "name": "get_weather" }
}
{
  "id": "resp_abc123",
  "object": "response",
  "status": "completed",
  "model": "arya",
  "output": [
    {
      "id": "fc_abc123",
      "type": "function_call",
      "status": "completed",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"location\":\"Paris\"}"
    }
  ],
  "usage": {
    "input_tokens": 80,
    "output_tokens": 25,
    "total_tokens": 105,
    "credits_used": 1
  }
}
{
  "model": "arya",
  "input": [
    {
      "role": "user",
      "content": [{ "type": "input_text", "text": "What is the weather in Paris?" }]
    },
    {
      "id": "fc_abc123",
      "type": "function_call",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"location\":\"Paris\"}"
    },
    {
      "type": "function_call_output",
      "call_id": "call_abc123",
      "output": "{\"temperature\":22,\"unit\":\"celsius\",\"condition\":\"sunny\"}"
    }
  ],
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get weather for a location.",
      "parameters": {
        "type": "object",
        "properties": { "location": { "type": "string" } },
        "required": ["location"]
      }
    }
  ]
}

export ANTHROPIC_BASE_URL="https://gab.ai"
export ANTHROPIC_AUTH_TOKEN="YOUR_GAB_API_KEY"

claude
{
  "model": "claude-sonnet-4-5",
  "max_tokens": 1024,
  "system": "You are a helpful coding assistant.",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Read package.json and summarize the scripts." }
      ]
    }
  ],
  "tools": [
    {
      "name": "read_file",
      "description": "Read a file from the current workspace.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": { "type": "string" }
        },
        "required": ["path"]
      }
    }
  ],
  "tool_choice": { "type": "auto" }
}
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-5-20250929",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_abc123",
      "name": "read_file",
      "input": { "path": "package.json" }
    }
  ],
  "stop_reason": "tool_use",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 120,
    "output_tokens": 32
  }
}
{
  "model": "claude-sonnet-4-5",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [{ "type": "text", "text": "Read package.json and summarize the scripts." }]
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "tool_use",
          "id": "toolu_abc123",
          "name": "read_file",
          "input": { "path": "package.json" }
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "tool_result",
          "tool_use_id": "toolu_abc123",
          "content": "{\"scripts\":{\"start\":\"node server.js\"}}"
        }
      ]
    }
  ],
  "tools": [
    {
      "name": "read_file",
      "description": "Read a file from the current workspace.",
      "input_schema": {
        "type": "object",
        "properties": { "path": { "type": "string" } },
        "required": ["path"]
      }
    }
  ]
}
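
For the Messages format, the follow-up request appends the assistant turn unchanged and then a user turn whose content holds one tool_result block per tool_use block. The following Python sketch shows that step under stated assumptions: append_tool_results and handlers are hypothetical names, and each tool is assumed to have a local handler that accepts the input object as keyword arguments.

```python
import json
from typing import Any, Callable


def append_tool_results(
    messages: list[dict[str, Any]],
    response: dict[str, Any],
    handlers: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Return a new messages list extended with tool results, if any were requested."""
    tool_uses = [b for b in response.get("content", []) if b.get("type") == "tool_use"]
    if response.get("stop_reason") != "tool_use" or not tool_uses:
        return list(messages)  # nothing to run; conversation is unchanged
    results = []
    for block in tool_uses:
        output = handlers[block["name"]](**block["input"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use id
            "content": output if isinstance(output, str) else json.dumps(output),
        })
    return list(messages) + [
        {"role": "assistant", "content": response["content"]},
        {"role": "user", "content": results},
    ]
```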
curl https://gab.ai/v1/messages/count_tokens \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_GAB_API_KEY" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [
      { "role": "user", "content": [{ "type": "text", "text": "Hello!" }] }
    ]
  }'

Claude Code setup

Set ANTHROPIC_BASE_URL to https://gab.ai and ANTHROPIC_AUTH_TOKEN to your Gab AI API key. Do not include /v1 in the base URL for Claude Code.

const response = await fetch('https://gab.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'arya',
    messages: [{ role: 'user', content: 'Tell me a story' }],
    stream: true
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder();

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  
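  // stream: true keeps multi-byte characters intact across chunk boundaries
  const chunk = decoder.decode(value, { stream: true });
  // Each chunk holds one or more SSE lines ("data: {...}"); parse them to extract tokens
  console.log(chunk);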
}

OpenAI SDK Support

The OpenAI SDK handles streaming automatically. Just set stream: true and iterate over the response.
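
If you read the stream without an SDK, the raw chunks still need to be split into events. The following Python sketch extracts text deltas from already-decoded lines. It assumes the stream uses OpenAI-style server-sent events (lines of the form data: {json}, ending with data: [DONE]), which this section does not spell out; iter_sse_text is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from decoded SSE lines (assumed OpenAI-style framing)."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

With the requests library, resp.iter_lines(decode_unicode=True) on a response opened with stream=True can supply the lines argument.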

curl "https://gab.ai/v1/models?type=embedding" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  | jq '.data[].id'
const response = await fetch('https://gab.ai/v1/embeddings', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'YOUR_EMBEDDING_MODEL',
    input: 'The quick brown fox jumps over the lazy dog'
  })
});
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gab.ai/v1"
)

response = client.embeddings.create(
    model="YOUR_EMBEDDING_MODEL",
    input=["First document", "Second document"]
)

for item in response.data:
    print(f"Index {item.index}: {len(item.embedding)} dimensions")
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023, -0.0091, 0.0152, ...],
      "index": 0
    }
  ],
  "model": "YOUR_EMBEDDING_MODEL",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9,
    "credits_used": 1
  }
}

No embedding models currently enabled

We don't have any embedding models enabled in production at the moment, so this endpoint will return 400 invalid_model until we re-enable one. Check GET /v1/models?type=embedding for the authoritative list — when it returns a non-empty data array, use one of those id values as the model field.

Batching

Pass an array of strings as input to embed multiple texts in a single request (up to 2048 items). This is significantly faster than making individual requests.
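
When a corpus is larger than the per-request limit, it has to be split before sending. The following Python sketch slices the input into chunks of at most 2048 items; batches and MAX_BATCH are hypothetical names, and each yielded list would be sent as the input field of one request.

```python
from typing import Iterator, Sequence

MAX_BATCH = 2048  # per-request item limit stated above


def batches(texts: Sequence[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of `texts`, each no longer than `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])
```

Because each response item carries an index relative to its own request, add the batch's starting offset when mapping embeddings back to the original list.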

curl https://gab.ai/v1/credits \
  -H "Authorization: Bearer YOUR_API_KEY"
{
  "object": "credit_balance",
  "monthly_credits": 2000,
  "monthly_used": 450,
  "monthly_remaining": 1550,
  "purchased_credits": 500,
  "total_available": 2050,
  "resets_at": "2026-02-01T00:00:00Z"
}

Credit Types

Monthly credits reset at the start of each billing cycle. Purchased credits never expire and are used after monthly credits are exhausted.
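
The spending order can be modeled locally to estimate a balance after a request. The following Python sketch applies the rule above (monthly first, then purchased) to a balance shaped like the sample response. It assumes total_available equals monthly_remaining plus purchased_credits, as the sample numbers suggest (1550 + 500 = 2050); spend is a hypothetical helper, and the server's balance remains authoritative.

```python
def spend(balance: dict, cost: int) -> dict:
    """Return a copy of a /v1/credits-style balance after spending `cost` credits."""
    if cost < 0:
        raise ValueError("cost must be non-negative")
    if cost > balance["total_available"]:
        raise ValueError("insufficient credits")
    # Monthly credits are drawn down first; purchased credits cover the rest.
    from_monthly = min(cost, balance["monthly_remaining"])
    from_purchased = cost - from_monthly
    updated = dict(balance)
    updated["monthly_used"] += from_monthly
    updated["monthly_remaining"] -= from_monthly
    updated["purchased_credits"] -= from_purchased
    updated["total_available"] -= cost
    return updated
```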

# Last 100 requests
curl https://gab.ai/v1/usage \
  -H "Authorization: Bearer YOUR_API_KEY"

# Poll for new activity since the last tx_id you've seen
curl "https://gab.ai/v1/usage?after=65f2a1b3c4d5e6f7a8b9c0d1" \
  -H "Authorization: Bearer YOUR_API_KEY"

# All chat.completions usage for March 2026
curl "https://gab.ai/v1/usage?start_date=2026-03-01&end_date=2026-03-31&endpoint=/v1/chat/completions&limit=1000" \
  -H "Authorization: Bearer YOUR_API_KEY"
{
  "object": "list",
  "data": [
    {
      "id": "65f2a1b3c4d5e6f7a8b9c0d1",
      "object": "usage_record",
      "created": 1712345678,
      "endpoint": "/v1/chat/completions",
      "model": "arya",
      "credits_used": 1,
      "tokens": {
        "prompt": 842,
        "completion": 156,
        "total": 998
      },
      "context_tokens": 842,
      "response_time_ms": 1245,
      "status_code": 200,
      "success": true
    }
  ],
  "has_more": false,
  "first_id": "65f2a1b3c4d5e6f7a8b9c0d1",
  "last_id": "65f2a1b3c4d5e6f7a8b9c0d1"
}

Polling pattern

To efficiently tail new activity, persist the first_id from the previous response (the newest tx_id in that page) and pass it as after on the next call. Do not use last_id here: that field is the oldest item in the page, since results are returned newest-first.
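
The cursor handling can be sketched in Python as below. The HTTP call is injected as fetch_usage so the logic stays testable; poll_once and fetch_usage are hypothetical names. The sketch also assumes that a page with no new records may omit first_id or return it as null, in which case the previous cursor is kept.

```python
from typing import Any, Callable, Optional


def poll_once(
    fetch_usage: Callable[[Optional[str]], dict[str, Any]],
    cursor: Optional[str],
) -> tuple[list[dict[str, Any]], Optional[str]]:
    """Fetch records newer than `cursor`; return (records, next_cursor).

    fetch_usage(after) should return the parsed JSON of GET /v1/usage,
    passing `after` as the query parameter when it is not None.
    """
    page = fetch_usage(cursor)
    records = page.get("data", [])
    # Results are newest-first, so first_id (not last_id) is the newest record.
    next_cursor = page.get("first_id") or cursor
    return records, next_cursor
```

Persist next_cursor between runs so that a restart resumes from the last record seen.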

# Get all models
curl https://gab.ai/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

# Get only image models
curl "https://gab.ai/v1/models?type=image" \
  -H "Authorization: Bearer YOUR_API_KEY"
{
  "object": "list",
  "data": [
    {
      "id": "arya",
      "object": "model",
      "created": 1700000000,
      "owned_by": "gab-ai",
      "capabilities": {
        "text": true,
        "images": false,
        "video": false,
        "audio": false,
        "streaming": true,
        "thinking": false,
        "web_search": true,
        "function_calling": true,
        "embeddings": false,
        "image_input": true,
        "file_input": true,
        "audio_input": false,
        "video_input": false
      },
      "context_window": 128000,
      "max_output_tokens": 8192,
      "credit_cost": { "base_cost": 1, "context_threshold": 20000 },
      "is_plus_only": false
    },
    {
      "id": "claude-opus-4-7",
      "object": "model",
      "created": 1700000000,
      "owned_by": "anthropic",
      "capabilities": {
        "text": true,
        "streaming": true,
        "thinking": true,
        "function_calling": true,
        "image_input": true,
        "file_input": true
      },
      "context_window": 200000,
      "max_output_tokens": 8192,
      "credit_cost": { "base_cost": 5, "context_threshold": 20000 },
      "is_plus_only": true
    },
    {
      "id": "gpt-image-2",
      "object": "model",
      "created": 1700000000,
      "owned_by": "openai",
      "capabilities": {
        "images": true
      },
      "credit_cost": { "base_cost": 15, "context_threshold": 0 },
      "is_plus_only": true
    }
    // ... more models
  ]
}

Model capabilities

The capabilities object lists boolean flags for every feature the model supports. Notable flags for chat: text, streaming, function_calling, thinking, web_search, image_input, file_input. Pass tools only to models with function_calling: true — otherwise the request fails with unsupported_tool_calling.
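
To pick a model before sending tools, you can filter the parsed /v1/models response by capability flags. The following Python sketch does that; models_with is a hypothetical helper. Because the sample response omits some flags for some models, a missing flag is treated as unsupported.

```python
from typing import Any


def models_with(models_response: dict[str, Any], *flags: str) -> list[str]:
    """Return ids of models whose capabilities set every given flag to true."""
    ids = []
    for model in models_response.get("data", []):
        caps = model.get("capabilities", {})
        # A flag that is absent or false counts as unsupported.
        if all(caps.get(flag) is True for flag in flags):
            ids.append(model["id"])
    return ids
```

For example, models_with(resp, "function_calling", "image_input") lists models that can take both tools and image input.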

const response = await fetch('https://gab.ai/v1/images/generations', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'gpt-image-2',
    prompt: 'A serene mountain landscape at sunset with a lake reflection',
    n: 1,
    size: '1024x1024',
    quality: 'hd'
  })
});
{
  "created": 1704067200,
  "data": [
    {
      "url": "https://cdn.gab.ai/images/generated/abc123.png",
      "revised_prompt": "A serene mountain landscape at sunset..."
    }
  ],
  "usage": {
    "credits_used": 5
  }
}

Image Models

Use /v1/models?type=image to see all available image generation models, their capabilities, and max_prompt_characters per model. Long structured text (recipes, articles) often exceeds the max_prompt_characters limit of smaller models; summarize it to a short visual description or use gpt-image-2.

curl -X POST https://gab.ai/v1/files \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "purpose=vision" \
  -F "file=@./photo.png"
const form = new FormData();
form.append('purpose', 'vision');
form.append('file', fileBlob, 'photo.png');

const res = await fetch('https://gab.ai/v1/files', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' },
  body: form,
});
const uploaded = await res.json();
// uploaded.id — use this as file_id in /v1/chat/completions
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://gab.ai/v1")

# Open the file in a context manager so the handle is closed after upload
with open("photo.png", "rb") as f:
    uploaded = client.files.create(
        file=f,
        purpose="vision",
    )

chat = client.chat.completions.create(
    model="arya",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"file_id": uploaded.id}},
        ],
    }],
)
print(chat.choices[0].message.content)
{
  "id": "67a1b2c3d4e5f6a7b8c9d0e1",
  "object": "file",
  "bytes": 184523,
  "created_at": 1704067200,
  "filename": "photo.png",
  "purpose": "vision",
  "mime_type": "image/png",
  "file_type": "image",
  "url": "https://cdn.gab.ai/users/abc/files/photo.png",
  "status": "active"
}
curl "https://gab.ai/v1/files?purpose=vision&limit=50" \
  -H "Authorization: Bearer YOUR_API_KEY"
curl https://gab.ai/v1/files/FILE_ID \
  -H "Authorization: Bearer YOUR_API_KEY"
curl -X DELETE https://gab.ai/v1/files/FILE_ID \
  -H "Authorization: Bearer YOUR_API_KEY"
{ "id": "FILE_ID", "object": "file", "deleted": true }

Limits

Per-file upload limit is 50 MB. Individual types have their own caps in downstream processing (e.g., Whisper transcription is capped at 25 MB). Image types accepted: PNG, JPEG, GIF, WebP, HEIC, SVG.
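
A client-side check can catch obvious problems before an upload is attempted. The following Python sketch rests on two assumptions that the text above does not confirm: that "50 MB" means 50 × 1024 × 1024 bytes, and that the accepted image types map to the listed file extensions (the server may inspect content instead). check_image_upload is a hypothetical helper.

```python
import os

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # assumption: "50 MB" read as 50 MiB
# Assumed extension mapping for PNG, JPEG, GIF, WebP, HEIC, SVG.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".heic", ".svg"}


def check_image_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in IMAGE_EXTENSIONS:
        problems.append(f"unsupported image extension: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_UPLOAD_BYTES}")
    return problems
```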

const response = await fetch('https://gab.ai/v1/videos/generations', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'veo-3-1-fast',
    prompt: 'A golden retriever running through a field of flowers',
    duration: 5,
    aspect_ratio: '16:9'
  })
});
{
  "id": "video_job_xyz789",
  "object": "video.generation",
  "status": "processing",
  "created": 1704067200,
  "model": "veo-3-1-fast",
  "prompt": "A golden retriever running through a field of flowers",
  "usage": {
    "credits_used": 20
  }
}
curl https://gab.ai/v1/videos/video_job_xyz789 \
  -H "Authorization: Bearer YOUR_API_KEY"
{
  "id": "video_job_xyz789",
  "object": "video.generation",
  "status": "completed",
  "created": 1704067200,
  "completed_at": 1704067320,
  "model": "veo-3-1-fast",
  "data": [
    {
      "url": "https://cdn.gab.ai/videos/generated/xyz789.mp4",
      "duration": 5,
      "thumbnail": "https://cdn.gab.ai/videos/thumbnails/xyz789.jpg"
    }
  ]
}

Processing Time

Video generation typically takes 1-3 minutes depending on duration and model. Implement exponential backoff when polling to avoid rate limits.
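
The following Python sketch shows one way to poll with exponential backoff. The status request is injected as fetch_job so no URL or client is hard-coded. The sketch assumes any status other than "processing" is terminal, since only "processing" and "completed" appear in the examples above; wait_for_video and the delay defaults are hypothetical choices.

```python
import time
from typing import Any, Callable


def wait_for_video(
    fetch_job: Callable[[], dict[str, Any]],
    *,
    initial_delay: float = 5.0,
    max_delay: float = 60.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the job leaves "processing", doubling the wait each time."""
    delay = initial_delay
    waited = 0.0
    while True:
        job = fetch_job()  # e.g. parsed JSON of GET /v1/videos/{id}
        if job.get("status") != "processing":
            return job  # "completed", or another status the API reports
        if waited + delay > timeout:
            raise TimeoutError("video job still processing after timeout")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # exponential growth, capped
```

With the defaults, the waits run 5, 10, 20, 40, then 60 seconds each, which covers the typical 1-3 minute window in a handful of requests.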

const response = await fetch('https://gab.ai/v1/audio/speech', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'
  },
  body: JSON.stringify({
    model: 'gpt-4o-mini-tts',
    input: 'Hello! Welcome to Gab AI. How can I help you today?',
    voice: 'nova',
    response_format: 'mp3',
    speed: 1.0
  })
});

const { url } = await response.json();
// Fetch the generated audio
const audio = await fetch(url).then(r => r.blob());
const audioUrl = URL.createObjectURL(audio);
import requests

resp = requests.post(
    "https://gab.ai/v1/audio/speech",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "gpt-4o-mini-tts",
        "voice": "nova",
        "input": "Hello! Welcome to Gab AI.",
    },
).json()

# Download the generated audio
with open("output.mp3", "wb") as f:
    f.write(requests.get(resp["url"]).content)
{
  "url": "https://cdn.gab.ai/audio/generated/abc123.mp3",
  "content_type": "audio/mp3",
  "usage": { "credits_used": 1 }
}

Differs from OpenAI's SDK

OpenAI's client.audio.speech.create(...) expects raw audio bytes back, so calling Gab AI through the OpenAI Python SDK with response.stream_to_file() will not work. Use raw HTTP (or fetch the returned url after parsing JSON) instead.

curl https://gab.ai/v1/api-keys \
  -H "Authorization: Bearer YOUR_API_KEY"
{
  "object": "list",
  "data": [
    {
      "id": "65f2a1b3c4d5e6f7a8b9c0d1",
      "object": "api_key",
      "key": "gab_5e17...d85f",
      "name": "Production Key",
      "is_active": true,
      "created": 1704067200,
      "last_used": 1704672000,
      "usage": { "requests": 4815, "tokens": 162342 }
    },
    {
      "id": "65f2a1b3c4d5e6f7a8b9c0d2",
      "object": "api_key",
      "key": "gab_a1b2...c3d4",
      "name": "Development Key",
      "is_active": true,
      "created": 1704153600,
      "last_used": null,
      "usage": { "requests": 0, "tokens": 0 }
    }
  ]
}
curl -X POST https://gab.ai/v1/api-keys \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "New Production Key"}'
{
  "id": "65f2a1b3c4d5e6f7a8b9c0d3",
  "object": "api_key",
  "key": "gab-5e17f695110d1e02c90e4537445d85fce4f5...",
  "name": "New Production Key",
  "is_active": true,
  "created": 1704067200,
  "message": "Save this key securely. It will not be shown again."
}
curl -X DELETE https://gab.ai/v1/api-keys/65f2a1b3c4d5e6f7a8b9c0d1 \
  -H "Authorization: Bearer YOUR_API_KEY"
{
  "id": "65f2a1b3c4d5e6f7a8b9c0d1",
  "object": "api_key",
  "deleted": true
}

Key prefix

Keys created via this endpoint use the prefix gab- (followed by 64 hex chars). Keys created via the in-app Settings UI use the prefix gab_ (followed by 32 hex chars). Both forms are valid as Authorization: Bearer tokens — treat the entire string as opaque.

Save Your Key

The full API key is only returned once when created. Store it securely—you won't be able to retrieve it again.

Self-deletion blocked

You cannot delete the API key being used to authenticate the request — the server returns 400 cannot_delete_current_key. Use a different active key (or the dashboard) to revoke it. Deletion is a soft delete: the key is marked inactive and rejected from then on.

# JSON bundle (default)
curl https://gab.ai/v1/account/export \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -o gab-ai-account-data.json

# ZIP archive (same shape the Settings UI downloads)
curl "https://gab.ai/v1/account/export?format=zip" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -o gab-ai-account-data.zip

# Slim JSON: metadata only, no message bodies
curl "https://gab.ai/v1/account/export?include_messages=false" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -o gab-ai-account-data-slim.json
{
  "object": "account_data_export",
  "exported_at": "2026-05-02T18:00:00.000Z",
  "user_id": "65f2a1b3c4d5e6f7a8b9c0d1",
  "counts": {
    "conversations": 412,
    "memories": 87,
    "files": 23,
    "agents": 4,
    "collections": 6,
    "voice_sessions": 18,
    "purchases": 3,
    "credit_purchases": 11,
    "referrals": 2,
    "feedback": 1,
    "inference_tasks": 0,
    "bookmark_folders": 5,
    "api_keys": 2
  },
  "truncated": {},
  "data": {
    "user": { "_id": "...", "email": "...", "username": "...", "...": "..." },
    "conversations": [ /* full conversation rows with messages */ ],
    "memories": [ /* memory rows (embeddings stripped) */ ],
    "files": [ /* file metadata + cdn paths */ ],
    "agents": [ /* your custom agents */ ],
    "collections": [ /* your collections */ ],
    "voice_sessions": [ /* voice mode session metadata */ ],
    "purchases": [ /* Plus subscription purchases */ ],
    "credit_purchases": [ /* one-time credit packs */ ],
    "referrals": [ /* referrals you sent or received */ ],
    "feedback": [ /* feedback you've submitted */ ],
    "inference_tasks": [ /* scheduled / recurring tasks */ ],
    "bookmark_folders": [ /* bookmark organization */ ],
    "api_keys": [ /* metadata only — secret values are never re-exported */ ]
  }
}
gab-ai-account-data-2026-05-02.zip
├── README.md                  # human-readable overview, counts, layout
├── manifest.json              # exported_at, counts, truncated, schema_version
├── profile.json               # your account profile
├── conversations.json         # raw bundle of every conversation
├── conversations/             # one .md file per conversation, easy to read
│   ├── 2026-05-01-untitled-conversation-abc123.md
│   └── ...
├── memories.json              # raw memory bundle
├── memories.txt               # one memory per line, easy to grep
├── files.json
├── agents.json
├── collections.json
├── voice_sessions.json
├── bookmark_folders.json
├── inference_tasks.json
├── purchases.json
├── credit_purchases.json
├── referrals.json
├── feedback.json
└── api_keys.json              # metadata only

What's excluded

Deleted records and temporary ("incognito") chats are filtered out before we build the archive. We also strip credentials (passwords, password reset tokens, MFA secrets, backup codes), payment-processor IDs (Authorize.net / Valmar customer + payment-method IDs), memory embedding vectors, files attached to temporary chats, and the secret value of API keys. Everything else that belongs to your account is in the export.

Truncation

Each collection is capped at a generous limit (5,000 conversations, 10,000 files, etc.). If a collection hits its cap, the JSON response lists that collection's key in the top-level truncated object (and the ZIP's README spells it out). Email support@gab.ai for a complete offline archive if you need the full history.
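
After downloading a JSON export, a quick check of the truncated object shows whether anything was cut off. The following Python sketch reads only the keys, because the shape of the values is not documented here; truncated_collections is a hypothetical helper.

```python
from typing import Any


def truncated_collections(export: dict[str, Any]) -> list[str]:
    """Return the names of collections that hit their export cap, sorted."""
    # An empty or missing `truncated` object means nothing was capped.
    return sorted(export.get("truncated") or {})
```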

Rate limits

The export endpoint is intentionally heavy. The same daily API rate limit applies, but in practice you should only need a handful of calls — cache the result locally rather than polling.

{
  "error": {
    "message": "Insufficient credits to complete request",
    "type": "insufficient_credits",
    "code": "credits_exhausted",
    "param": null
  }
}

Handling Rate Limits

When you receive a 429 error, check the X-RateLimit-Reset header to know when you can resume making requests.
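
The following Python sketch turns that header into a wait time. It assumes X-RateLimit-Reset holds a Unix timestamp in seconds, which this section does not state; if the API sends a different format, such as seconds remaining, the arithmetic must change. seconds_until_reset is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the header is unusable.

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("x-ratelimit-reset")
    if raw is None:
        return None
    try:
        reset_at = float(raw)
    except ValueError:
        return None
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)  # never return a negative wait
```

When the function returns None, fall back to a fixed or exponentially growing delay instead of retrying immediately.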