POST /v1/chat/completions
cURL
curl --request POST \
  --url https://api.perceptron.inc/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "perceptron-mk1",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "video_url",
          "video_url": {
            "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/tutorials/isaac_frame_by_frame/surf.mp4"
          }
        },
        {
          "type": "text",
          "text": "What happens in this video?"
        }
      ]
    }
  ],
  "vision_config": {
    "enable_thinking": true
  }
}
'

Example response:
{
  "choices": [
    {
      "index": 1,
      "message": {
        "content": "<string>",
        "reasoning_content": "<string>"
      }
    }
  ],
  "created": 1,
  "id": "<string>",
  "model": "<string>",
  "object": "<string>",
  "usage": {
    "completion_tokens": 1,
    "prompt_tokens": 1,
    "total_tokens": 1
  }
}
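
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names build_request, send, and extract_answer are hypothetical and not part of any Perceptron SDK. It assumes the response has the shape shown in the example above.

```python
import json
import urllib.request
from typing import Optional, Tuple

API_URL = "https://api.perceptron.inc/v1/chat/completions"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Wrap a JSON payload in an authenticated POST request (nothing is sent yet)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(payload: dict, api_key: str) -> dict:
    """Send the request and decode the JSON response body (performs network I/O)."""
    with urllib.request.urlopen(build_request(payload, api_key)) as resp:
        return json.load(resp)


def extract_answer(response: dict) -> Tuple[str, Optional[str]]:
    """Return (content, reasoning_content) from the first choice."""
    message = response["choices"][0]["message"]
    return message["content"], message.get("reasoning_content")
```

Calling send(payload, api_key) with the JSON body from the cURL example and then passing the result to extract_answer should yield the answer text and, when thinking is enabled, the reasoning text.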


Overview

The Chat Completions API is fully compatible with OpenAI’s chat completions specification and supports both text-only and multimodal (vision) requests. Use it to generate responses from Perceptron Mk1. On Perceptron Mk1, thinking and structured grounding are controlled through the typed vision_config body field.

vision_config

For perceptron-mk1, pass a top-level vision_config object alongside messages:

Field                  Values                                  Purpose
annotation_format      "point" / "box" / "polygon" / "clip"    Grounded output format. "clip" is video-only.
enable_thinking        true / false                            Chain-of-thought reasoning.
internal_tools.focus   true / false                            Enables the focus tool, which lets the model zoom into regions. Image-only.
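
The constraints in the table can be checked client-side before a request is sent. The sketch below is one possible check, not an official validator: the function name check_vision_config and the media_type labels "image" and "video" are hypothetical, and the server may enforce these rules differently.

```python
from typing import List

ALLOWED_FORMATS = {"point", "box", "polygon", "clip"}


def check_vision_config(config: dict, media_type: str) -> List[str]:
    """Return a list of problems; an empty list means the config matches the table.

    media_type is a caller-supplied label, "image" or "video" (an assumption
    of this sketch, not an API field).
    """
    problems = []
    fmt = config.get("annotation_format")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        problems.append(f"unknown annotation_format: {fmt!r}")
    if fmt == "clip" and media_type != "video":
        problems.append("annotation_format 'clip' is video-only")
    focus = config.get("internal_tools", {}).get("focus", False)
    if focus and media_type != "image":
        problems.append("internal_tools.focus is image-only")
    return problems
```

For example, check_vision_config({"annotation_format": "clip"}, "image") reports one problem, while the same config with "video" reports none.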

When to enable_thinking

  • On for text Q&A, captioning, OCR, and video clipping (annotation_format: "clip").
  • Off for spatial detection (annotation_format in "point", "box", "polygon").
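
The guidance above can be captured in a small helper that derives enable_thinking from the chosen annotation format. This is a sketch under that guidance only; vision_config_for is a hypothetical name, and sending enable_thinking: false explicitly (rather than omitting it) is a choice made here for clarity.

```python
from typing import Optional

# Formats for which the guidance above recommends turning thinking off.
SPATIAL_FORMATS = {"point", "box", "polygon"}


def vision_config_for(annotation_format: Optional[str] = None) -> dict:
    """Build a vision_config with enable_thinking set per the guidance above.

    annotation_format=None covers text Q&A, captioning, and OCR.
    """
    config = {}
    if annotation_format is not None:
        config["annotation_format"] = annotation_format
    # Thinking is on for everything except spatial detection formats.
    config["enable_thinking"] = annotation_format not in SPATIAL_FORMATS
    return config
```

Pass the result as the top-level vision_config field of the request body.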

Example: Grounded detection

curl https://api.perceptron.inc/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "model": "perceptron-mk1",
    "messages": [
      { "role": "user",
        "content": [
          { "type": "image_url",
            "image_url": { "url": "<image-url>" } },
          { "type": "text",
            "text": "Find every worker wearing PPE." }
        ]
      }
    ],
    "vision_config": { "annotation_format": "box" }
  }'

Example: Video clipping

curl https://api.perceptron.inc/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "model": "perceptron-mk1",
    "messages": [
      { "role": "user",
        "content": [
          { "type": "video_url",
            "video_url": { "url": "<video-url>" } },
          { "type": "text",
            "text": "Clip the moment the worker scans the package." }
        ]
      }
    ],
    "vision_config": { "annotation_format": "clip", "enable_thinking": true }
  }'

Example: Image reasoning with focus

curl https://api.perceptron.inc/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "model": "perceptron-mk1",
    "messages": [
      { "role": "user",
        "content": [
          { "type": "image_url",
            "image_url": { "url": "<image-url>" } },
          { "type": "text",
            "text": "What is the serial number on the device in the corner?" }
        ]
      }
    ],
    "vision_config": {
      "enable_thinking": true,
      "internal_tools": { "focus": true }
    }
  }'

Streaming

Set "stream": true to receive Server-Sent Events (SSE). To get token usage, also set stream_options.include_usage: true. When enabled, usage is attached to the final chunk (the one with finish_reason: "stop"), immediately before data: [DONE].

curl https://api.perceptron.inc/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "model": "perceptron-mk1",
    "messages": [
      { "role": "user",
        "content": [
          { "type": "image_url",
            "image_url": { "url": "<image-url>" } },
          { "type": "text", "text": "Describe this scene in detail." }
        ]
      }
    ],
    "vision_config": { "enable_thinking": true },
    "stream": true,
    "stream_options": { "include_usage": true }
  }'
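
On the client side, the stream can be consumed line by line. The sketch below accumulates text and picks up usage from whichever chunk carries it. It assumes each event is a "data: <json>" line and that text arrives in choices[].delta.content, as in OpenAI's streaming format; this page does not show a chunk, so treat that shape as an assumption. collect_stream is a hypothetical helper name.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate streamed text and return it with the usage object, if any."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}  # assumed OpenAI-style delta
            if delta.get("content"):
                text_parts.append(delta["content"])
    return "".join(text_parts), usage
```

Any iterable of decoded lines works as input, for example the lines of an HTTP response body decoded as UTF-8.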

Best Practices

  1. Thinking pairs well with text and clipping; not with spatial detection. Turn enable_thinking on for text Q&A, captioning, OCR, and annotation_format: "clip". Turn it off for "point", "box", and "polygon".
  2. Leave temperature unset. The default is 0.0 (deterministic). Only set a non-zero value if you want more varied outputs.
  3. Media format: HTTP(S) URLs and base64 data URLs are both supported. Supported MIME types: image/png, image/jpeg, image/webp, video/mp4, video/webm.
  4. Token limits: 32K context, 8K output.
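
For local files, a base64 data URL can be built as follows. This is a minimal sketch: to_data_url and image_part are hypothetical helper names, and the MIME set is copied from the list above.

```python
import base64

SUPPORTED_MIME_TYPES = {
    "image/png", "image/jpeg", "image/webp", "video/mp4", "video/webm",
}


def to_data_url(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL, rejecting unlisted MIME types."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part(url: str) -> dict:
    """Build an image_url content part from an HTTP(S) URL or a data URL."""
    return {"type": "image_url", "image_url": {"url": url}}
```

A typical use is image_part(to_data_url(image_bytes, "image/png")), with image_bytes read from a file opened in binary mode.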

Limits

Limit               Value
Requests            300/min
Request body size   20 MB
Media upload        20 GB per 48 hours

For large images, resize client-side before uploading. See the Tokenization guide for optimization tips.
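
A request can be measured against the body size limit before it is sent. The sketch below assumes "20 MB" means 20,000,000 bytes and that the limit applies to the serialized JSON body; neither is stated on this page, so the constant is deliberately conservative. The helper names are hypothetical.

```python
import json

# Assumption: decimal megabytes. If the server counts 20 MiB, this is stricter.
MAX_BODY_BYTES = 20 * 1000 * 1000


def body_size_bytes(payload: dict) -> int:
    """Size of the payload once serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_body_limit(payload: dict, limit: int = MAX_BODY_BYTES) -> bool:
    """True if the serialized payload is within the given byte limit."""
    return body_size_bytes(payload) <= limit
```

Base64 encoding grows data by about one third, so inline media of roughly 15 MB or more is unlikely to fit; in that case an HTTP(S) URL may be the better option.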

Authorizations

Authorization
string
header
default:Bearer $PERCEPTRON_API_KEY
required

Bearer token authentication using your Perceptron API key

Body

application/json

messages
(User message · object | System message · object | Assistant message · object)[]
required

Conversation history, listed in order. Supported roles: system, user, assistant.

Each message's role field identifies its author, as defined by the OpenAI Chat Completions spec.

Example:
[
  {
    "role": "user",
    "content": [
      {
        "type": "video_url",
        "video_url": {
          "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/tutorials/isaac_frame_by_frame/surf.mp4"
        }
      },
      {
        "type": "text",
        "text": "What happens in this video?"
      }
    ]
  }
]

model
string
default:perceptron-mk1
required

The model to invoke. Available options: perceptron-mk1, isaac-0.2-2b-preview, isaac-0.2-1b, isaac-0.1.

Example:

"perceptron-mk1"

frequency_penalty
number<float> | null

Positive values discourage the model from repeating previously used tokens.

Required range: -2 <= x <= 2

max_completion_tokens
integer<int32> | null

Maximum number of completion tokens to generate. Perceptron Mk1 has a 32K context window with an 8K output cap.

Required range: x >= 0

Example:

1024
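
A requested value can be clamped to the stated limits before sending. This sketch assumes that 32K means 32,768 tokens, that 8K means 8,192 tokens, and that prompt and completion tokens share the context window; none of these details is stated on this page. clamp_max_completion_tokens is a hypothetical helper name.

```python
CONTEXT_WINDOW = 32 * 1024  # assumption: "32K" = 32,768 tokens
OUTPUT_CAP = 8 * 1024       # assumption: "8K" = 8,192 tokens


def clamp_max_completion_tokens(requested: int, prompt_tokens: int) -> int:
    """Limit a requested completion budget to the output cap and remaining context."""
    if requested < 0 or prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    remaining = max(CONTEXT_WINDOW - prompt_tokens, 0)
    return min(requested, OUTPUT_CAP, remaining)
```

The prompt_tokens argument could come from the usage object of an earlier, similar request.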

presence_penalty
number<float> | null

Positive values encourage the model to introduce new concepts.

Required range: -2 <= x <= 2

regex
string | null

Regex pattern for constrained generation.

response_format
object

An object specifying the format that the model must output. Setting it to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model's output matches your supplied JSON schema.
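
As an illustration, a response_format value could be assembled like this. The inner fields name, schema, and strict follow OpenAI's Structured Outputs convention; this page elides the contents of json_schema, so whether Perceptron accepts each of them is an assumption. Both helper names are hypothetical.

```python
import json


def json_schema_format(name: str, schema: dict) -> dict:
    """Build a response_format object (inner fields assumed from OpenAI's convention)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": True},
    }


def parse_structured(response: dict) -> dict:
    """Decode the first choice's content, which should be a JSON document."""
    return json.loads(response["choices"][0]["message"]["content"])
```

The result of json_schema_format goes in the request body under response_format; parse_structured then turns the returned content string into a Python dict.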

stream
boolean | null
default:false

Set to true for SSE streaming. When omitted, the API returns a single JSON response.

stream_options
object

Optional streaming flags. Set include_usage: true to receive token counts in the final stream chunk.

temperature
number<float> | null
default:0

Sampling temperature. Lower values yield more deterministic replies; higher values produce more varied outputs.

Server default is 0.0 for all Perceptron models (perceptron-mk1, isaac-0.1, isaac-0.2-1b, isaac-0.2-2b-preview). Omit the field unless you specifically want non-deterministic sampling.

Required range: 0 <= x <= 2

top_k
integer<int32> | null

Top-k sampling. The model samples from the top k most likely tokens.

Required range: x >= 0

top_p
number<float> | null

Nucleus sampling probability. The model samples from the smallest token set whose cumulative probability exceeds this threshold.

Required range: x <= 1
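
The sampling fields above can be gathered by a helper that omits anything left unset, so the server defaults apply, and that enforces the documented ranges locally. This is a sketch; sampling_params is a hypothetical name, and only the bounds listed on this page are checked.

```python
from typing import Optional


def sampling_params(
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    top_k: Optional[int] = None,
) -> dict:
    """Return only the sampling fields that were set, after range checks."""
    params = {}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must satisfy 0 <= x <= 2")
        params["temperature"] = temperature
    if top_p is not None:
        if top_p > 1:
            raise ValueError("top_p must satisfy x <= 1")
        params["top_p"] = top_p
    if top_k is not None:
        if top_k < 0:
            raise ValueError("top_k must satisfy x >= 0")
        params["top_k"] = top_k
    return params
```

Merging the result into the request body, for example with payload.update(sampling_params(top_p=0.9)), leaves temperature out of the request entirely unless it was set.
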
vision_config
object

Perceptron vision-model controls (thinking, spatial output format, internal-tool toggles). Recommended for perceptron-mk1. Isaac-series models (isaac-0.1, isaac-0.2-1b, isaac-0.2-2b-preview) use <hint>...</hint> system messages instead.

Response

Chat completion generated successfully.

Non-streaming response body when stream=false.

choices
object[]
required

created
integer<int64>
required
Required range: x >= 0

id
string
required

model
string
required

object
string
required

usage
object

Token accounting emitted with every completion.