POST /v1/chat/completions
cURL
curl --request POST \
  --url https://api.perceptron.inc/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "messages": [
    {
      "content": "<string>",
      "role": "system"
    }
  ],
  "model": "<string>",
  "frequency_penalty": 0,
  "max_completion_tokens": 1,
  "presence_penalty": 0,
  "response_format": "<unknown>",
  "stream": false,
  "stream_options": "<unknown>",
  "temperature": 1,
  "top_p": 1
}
'
{
  "choices": [
    {
      "index": 1,
      "message": {
        "role": "system",
        "content": "<string>",
        "reasoning_content": "<string>"
      },
      "finish_reason": "<unknown>"
    }
  ],
  "created": 1,
  "id": "<string>",
  "model": "<string>",
  "object": "<string>",
  "usage": "<unknown>"
}

Overview

The Chat Completions API is fully compatible with OpenAI’s chat completions specification, supporting both text-only and multimodal (vision) requests. Use it to generate responses from Perceptron’s vision-language models.

Authentication

All requests require an Authorization header with your API key:
Authorization: Bearer YOUR_API_KEY
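
As a minimal sketch, the same header can be assembled in Python. The helper name `build_headers` and the environment variable `PERCEPTRON_API_KEY` are illustrative assumptions, not part of the API.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers used by the cURL examples on this page."""
    if not api_key:
        raise ValueError("API key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# PERCEPTRON_API_KEY is a hypothetical variable name chosen for this sketch.
headers = build_headers(os.environ.get("PERCEPTRON_API_KEY", "YOUR_API_KEY"))
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.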

Vision Hints

Perceptron models support optional hints in system messages to optimize performance for specific vision tasks:
  • <hint>BOX</hint> - Optimizes the model for bounding box detection tasks
  • <hint>POINT</hint> - Optimizes the model for point/keypoint detection tasks
  • <hint>POLYGON</hint> - Optimizes the model for polygon/segmentation tasks
Include hints in the system message content field to guide model behavior.
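
The hint placement described above can be sketched as a small message builder. The function names are hypothetical; the message shapes mirror the cURL examples in the next section.

```python
# The three hint values listed above.
VALID_HINTS = ("BOX", "POINT", "POLYGON")


def hint_system_message(hint: str) -> dict:
    """Build a system message whose content is a single vision hint."""
    if hint not in VALID_HINTS:
        raise ValueError(f"unknown hint {hint!r}; expected one of {VALID_HINTS}")
    return {"role": "system", "content": f"<hint>{hint}</hint>"}


def vision_messages(hint: str, image_url: str, prompt: str) -> list:
    """Return a messages array: the hint first, then one image-plus-text user turn."""
    return [
        hint_system_message(hint),
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": image_url}},
                {"type": "text", "text": prompt},
            ],
        },
    ]
```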

Examples

Visual Question Answering with BOX Hint

Use the <hint>BOX</hint> hint to optimize for grounded question answering with bounding boxes:
curl --location 'https://api.perceptron.inc/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "model": "isaac-0.2-2b-preview",
    "messages": [
        {
            "role": "system",
            "content": "<hint>BOX</hint>"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/qna/studio_scene.webp"
                    }
                },
                {
                    "type": "text",
                    "text": "Determine the focal point of this scene."
                }
            ]
        }
    ],
    "temperature": 0.0,
    "stream": false
}'
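
For readers calling the API from code rather than a shell, the request above might look like this in Python using only the standard library. This is a sketch, not an official client: `build_request` and `send` are hypothetical helpers, and the shape of the returned JSON is assumed to match the sample response at the top of this page.

```python
import json
import urllib.request

API_URL = "https://api.perceptron.inc/v1/chat/completions"


def build_request(api_key, image_url, prompt, hint=None,
                  model="isaac-0.2-2b-preview"):
    """Assemble the same POST request as the cURL example, without sending it."""
    messages = []
    if hint is not None:
        messages.append({"role": "system", "content": f"<hint>{hint}</hint>"})
    messages.append({
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": image_url}},
            {"type": "text", "text": prompt},
        ],
    })
    payload = {"model": model, "messages": messages,
               "temperature": 0.0, "stream": False}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60.0):
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting construction from sending makes the payload easy to inspect or log before any network traffic occurs.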

Object Detection (PPE Detection)

Detect safety equipment like helmets and vests in workplace environments:
curl --location 'https://api.perceptron.inc/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "model": "isaac-0.2-2b-preview",
    "messages": [
        {
            "role": "system",
            "content": "<hint>BOX</hint>"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/detection/ppe_line.webp"
                    }
                },
                {
                    "type": "text",
                    "text": "Find every worker wearing PPE. Focus on helmets and high-visibility vests. Return one bounding box per instance."
                }
            ]
        }
    ],
    "temperature": 0.0,
    "stream": false
}'

OCR with Custom Prompts

Extract structured text from images with domain-specific prompts:
curl --location 'https://api.perceptron.inc/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "model": "isaac-0.2-2b-preview",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/ocr/grocery_labels.webp"
                    }
                },
                {
                    "type": "text",
                    "text": "Extract each produce label along with its listed price."
                }
            ]
        }
    ],
    "temperature": 0.0,
    "stream": false
}'

Detailed Image Captioning

Generate rich descriptions with spatial grounding for scene understanding:
curl --location 'https://api.perceptron.inc/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "model": "isaac-0.2-2b-preview",
    "messages": [
        {
            "role": "system",
            "content": "<hint>BOX</hint>"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/caption/solar_array.webp"
                    }
                },
                {
                    "type": "text",
                    "text": "Provide a detailed caption describing the key elements in this image."
                }
            ]
        }
    ],
    "temperature": 0.0,
    "stream": false
}'

Streaming Response

Enable streaming for real-time token generation:
curl --location 'https://api.perceptron.inc/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "model": "isaac-0.2-2b-preview",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/caption/suburban_street.webp"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe this scene in detail."
                }
            ]
        }
    ],
    "temperature": 0.0,
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'
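When streaming is enabled, the API returns Server-Sent Events (SSE), and the final event includes usage statistics. Per the stream_options reference, the gateway always sets include_usage to true, so usage is reported even when the flag is omitted.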
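
A client-side sketch of consuming such a stream follows. It assumes the event format used by OpenAI-compatible servers, which this page does not spell out: each event is a `data: <json>` line, incremental text arrives in `choices[].delta.content`, and the stream ends with `data: [DONE]`. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from SSE 'data:' lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(data)


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Concatenate streamed text and capture the usage object if one arrives."""
    parts = []
    usage = None
    for event in iter_sse_events(lines):
        for choice in event.get("choices") or []:
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                parts.append(piece)
        if event.get("usage") is not None:
            usage = event["usage"]
    return "".join(parts), usage
```

`lines` can be any iterable of decoded text lines, for example the lines of an HTTP response body read incrementally.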

Model-Specific Recommendations

Isaac 0.1

  • Temperature: Default and recommended value is 0.0 for optimal performance and deterministic outputs
  • Best for: Vision tasks, spatial reasoning, object detection

Qwen3VL (qwen3-vl-235b-a22b-thinking)

  • Temperature: Recommended value is 0.7 for balanced creativity and coherence
  • Best for: Complex reasoning, detailed descriptions, multi-step analysis

Best Practices

  1. Use hints for spatial tasks: Include the appropriate hint (BOX, POINT, or POLYGON) in the system message when performing detection or localization
  2. Temperature tuning: Use 0.0 for deterministic outputs (the default and recommended value for Isaac 0.1) and a higher value such as 0.7 (recommended for Qwen3-VL) for more creative tasks
  3. Image format: Both HTTP(S) URLs and base64 data URLs are supported
  4. Token limits: Set max_completion_tokens to cap response length and cost
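
To illustrate the base64 option in item 3, a local image can be wrapped in a data URL and passed in the same image_url field. This sketch assumes the standard `data:<mime>;base64,<payload>` form; the helper names are hypothetical, and the caller must supply the correct MIME type.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/webp") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part(url: str) -> dict:
    """Build an image content part; url may be an HTTP(S) URL or a data URL."""
    return {"type": "image_url", "image_url": {"url": url}}
```

Base64 inflates the payload by roughly a third, which matters for the 413 size limit listed under Error Handling.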

Error Handling

Common error responses:
  • 400 Bad Request: Invalid request parameters (e.g., negative temperature value)
  • 401 Unauthorized: Invalid or missing API key
  • 402 Payment Required: Organization out of credits
  • 413 Payload Too Large: Request exceeds size limits (reduce image sizes or count)
  • 422 Unprocessable Entity: Request body failed validation or deserialization (e.g., missing required fields)
  • 429 Too Many Requests: Rate limit exceeded; retry with exponential backoff
  • 500/502/503: Server errors; retry with exponential backoff
See the API reference below for complete request/response schemas.
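
The retry guidance above can be sketched as follows. The delay schedule, jitter amount, and retry count are illustrative choices, not values documented by the API. `send` is a hypothetical callable that performs one request and returns `(status_code, body)`.

```python
import random
import time
from typing import Callable, List, Tuple

# Status codes the list above says to retry with backoff.
RETRYABLE_STATUS = {429, 500, 502, 503}


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Exponential schedule: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retries(send: Callable[[], Tuple[int, dict]],
                      max_retries: int = 4,
                      base: float = 1.0,
                      cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, dict]:
    """Call `send` until it returns a non-retryable status or retries run out."""
    delays = backoff_delays(max_retries, base, cap)
    attempt = 0
    while True:
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt >= max_retries:
            return status, body
        # Add up to 10% random jitter so concurrent clients do not retry in lockstep.
        sleep(delays[attempt] + random.uniform(0.0, 0.1 * delays[attempt]))
        attempt += 1
```

Client errors such as 400, 401, 402, 413, and 422 are returned immediately, since repeating the same request would not change the outcome.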

Authorizations

Authorization (string, header, required)

Bearer token authentication using your Perceptron API key

Body

application/json
messages (object[], required)

Conversation history listed in order. Supported roles: system, user, assistant.

Each message's role field gives the author role of the message, as defined by the OpenAI Chat Completions spec.

model (string, required)

The model to invoke. Available options: isaac-0.1, qwen3-vl-235b-a22b-thinking.

frequency_penalty (number<float> | null, default: 0)

Positive values discourage the model from repeating previously used tokens.

Required range: -2 <= x <= 2

max_completion_tokens (integer<int32> | null)

Maximum number of completion tokens to generate.

Model-specific limits:

  • Isaac 0.1: The combined total of input tokens and output tokens must not exceed 8192 tokens.
Required range: x >= 0
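
The combined limit implies a simple budget calculation, sketched below. How input tokens are counted (especially for images) is not specified here, so `input_tokens` is assumed to be known, for example from the usage object of an earlier response. The function names are hypothetical.

```python
# Combined input + output limit stated above for Isaac 0.1.
ISAAC_0_1_TOKEN_LIMIT = 8192


def completion_budget(input_tokens: int, limit: int = ISAAC_0_1_TOKEN_LIMIT) -> int:
    """Largest completion size that keeps input + output within the limit."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return max(0, limit - input_tokens)


def clamp_max_completion_tokens(requested: int, input_tokens: int,
                                limit: int = ISAAC_0_1_TOKEN_LIMIT) -> int:
    """Reduce a requested max_completion_tokens value to fit the remaining budget."""
    return min(requested, completion_budget(input_tokens, limit))
```

For example, a prompt that consumes 6000 input tokens leaves room for at most 8192 - 6000 = 2192 completion tokens.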

presence_penalty (number<float> | null, default: 0)

Positive values encourage the model to introduce new concepts.

Required range: -2 <= x <= 2

response_format (object)

An object specifying the format that the model must output. Setting it to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model's output matches your supplied JSON schema.
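
A possible way to build this object is sketched below. The inner `name`, `schema`, and `strict` keys follow OpenAI's Structured Outputs convention and are an assumption here, since this page elides the contents of json_schema. The example schema for produce labels is purely illustrative.

```python
def json_schema_format(name: str, schema: dict, strict: bool = True) -> dict:
    """Wrap a JSON Schema in a response_format object.

    Assumption: the inner keys mirror OpenAI's Structured Outputs layout.
    """
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": strict},
    }


# Hypothetical schema for the grocery-label OCR example on this page.
PRODUCE_SCHEMA = {
    "type": "object",
    "properties": {
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "label": {"type": "string"},
                    "price": {"type": "string"},
                },
                "required": ["label", "price"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["items"],
    "additionalProperties": False,
}

response_format = json_schema_format("produce_labels", PRODUCE_SCHEMA)
```

The resulting dictionary would be placed under the response_format key of the request body.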

stream (boolean | null, default: false)

Set to true for SSE streaming. When omitted, the API returns a single JSON response.

stream_options (object)

Optional streaming flags. The gateway always sets include_usage = true so usage/tokens are emitted at stream end.

temperature (number<float> | null)

Sampling temperature. Lower values yield more deterministic replies; higher values produce more varied, creative outputs.

Model-specific recommendations:

  • Isaac 0.1: Default and recommended value is 0.0.
  • Qwen3-VL: Recommended value is 0.7.
Required range: 0 <= x <= 2

top_p (number<float> | null, default: 1)

Nucleus sampling probability. The model samples from the smallest token set whose cumulative probability exceeds this threshold.

Required range: 0 < x <= 1
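
Because out-of-range values are rejected with 400 Bad Request, checking the documented ranges before sending can save a round trip. The sketch below encodes the "Required range" lines from this section; the function and constant names are hypothetical, and server-side validation remains authoritative.

```python
from typing import List

# (parameter name, range check, rule text) copied from the reference above.
RANGE_RULES = [
    ("frequency_penalty", lambda v: -2 <= v <= 2, "-2 <= x <= 2"),
    ("presence_penalty", lambda v: -2 <= v <= 2, "-2 <= x <= 2"),
    ("temperature", lambda v: 0 <= v <= 2, "0 <= x <= 2"),
    ("top_p", lambda v: 0 < v <= 1, "0 < x <= 1"),
    ("max_completion_tokens", lambda v: v >= 0, "x >= 0"),
]


def validate_sampling_params(params: dict) -> List[str]:
    """Return one message per out-of-range parameter; an empty list means OK."""
    errors = []
    for name, in_range, rule in RANGE_RULES:
        value = params.get(name)
        if value is None:
            continue  # omitted or null parameters are allowed
        if not in_range(value):
            errors.append(f"{name}={value!r} violates {rule}")
    return errors
```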

Response

Chat completion generated successfully.

Non-streaming response body when stream=false.

choices (object[], required)
created (integer<int64>, required); required range: x >= 0
id (string, required)
model (string, required)
object (string, required)
usage (object)

Token accounting emitted with every completion.
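
As a sketch of reading a non-streaming response, the helper below pulls the commonly needed fields out of a decoded body. Field names follow the sample response at the top of this page; the `Completion` dataclass and `parse_completion` are hypothetical names, and treating reasoning_content as optional is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Completion:
    content: Optional[str]
    reasoning_content: Optional[str]
    finish_reason: Optional[str]
    usage: Optional[dict]


def parse_completion(body: dict, position: int = 0) -> Completion:
    """Extract the message and metadata of one choice from a response body."""
    choices = body.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    choice = choices[position]
    message = choice.get("message") or {}
    return Completion(
        content=message.get("content"),
        reasoning_content=message.get("reasoning_content"),
        finish_reason=choice.get("finish_reason"),
        usage=body.get("usage"),
    )
```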