Chat Completions
Overview
The Chat Completions API is fully compatible with OpenAI’s chat completions specification, supporting both text-only and multimodal (vision) requests. Use it to generate responses from Perceptron Mk1. Perceptron Mk1 triggers thinking and structured grounding through the typed vision_config body field.
vision_config
For perceptron-mk1, pass a top-level vision_config object alongside messages:
| Field | Values | Purpose |
|---|---|---|
| annotation_format | "point" / "box" / "polygon" / "clip" | Grounded output format. clip is video-only. |
| enable_thinking | true / false | Chain-of-thought reasoning. |
| internal_tools.focus | true / false | Enable the focus tool (the model can zoom into regions). Image only. |
When to enable_thinking
- On for text Q&A, captioning, OCR, and video clipping (annotation_format: "clip").
- Off for spatial detection (annotation_format in "point", "box", "polygon").
Example: Grounded detection
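A minimal sketch of a grounded detection request body, following the guidance above: annotation_format is set to a spatial format and enable_thinking is off. The helper name build_detection_request, the image URL, and the prompt are hypothetical, and the image_url content part is assumed from the OpenAI-compatible message format rather than taken from this page.

```python
import json


def build_detection_request(image_url: str, prompt: str,
                            annotation_format: str = "box") -> dict:
    """Build a chat completions body for spatial detection (hypothetical helper)."""
    if annotation_format not in ("point", "box", "polygon"):
        raise ValueError("spatial detection expects point, box, or polygon")
    return {
        "model": "perceptron-mk1",
        "messages": [
            {
                "role": "user",
                "content": [
                    # Assumed OpenAI-style image part; the URL is a placeholder.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        # Top-level vision_config sits alongside messages.
        "vision_config": {
            "annotation_format": annotation_format,
            "enable_thinking": False,  # off for spatial detection
        },
    }


body = build_detection_request("https://example.com/street.jpg",
                               "Find every bicycle.")
print(json.dumps(body, indent=2))
```

The body can be sent as the JSON payload of a chat completions request with any HTTP client.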
Example: Video clipping
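A minimal sketch of a video clipping request body, assuming the video_url content part shown in the Body section and the recommendation to enable thinking with annotation_format: "clip". The helper name build_clip_request and the prompt text are hypothetical.

```python
import json

# Sample video URL taken from the messages example on this page.
VIDEO_URL = (
    "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/"
    "cookbook/_shared/assets/tutorials/isaac_frame_by_frame/surf.mp4"
)


def build_clip_request(video_url: str, prompt: str) -> dict:
    """Build a chat completions body for video clipping (hypothetical helper)."""
    return {
        "model": "perceptron-mk1",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "video_url", "video_url": {"url": video_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        "vision_config": {
            "annotation_format": "clip",  # clip is video-only
            "enable_thinking": True,      # recommended on for clipping
        },
    }


clip_body = build_clip_request(VIDEO_URL, "Clip the moments where the surfer rides a wave.")
print(json.dumps(clip_body["vision_config"]))
```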
Example: Image reasoning with focus
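A minimal sketch of an image reasoning request with the focus tool enabled. Two things here are assumptions: that the dotted field internal_tools.focus maps to a nested internal_tools object inside vision_config, and that the image is passed as an OpenAI-style image_url part. The helper name, image URL, and question are hypothetical.

```python
import json


def build_focus_request(image_url: str, question: str) -> dict:
    """Build a chat completions body with thinking and focus on (hypothetical helper)."""
    return {
        "model": "perceptron-mk1",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
        "vision_config": {
            "enable_thinking": True,
            # Assumed nesting for the internal_tools.focus field; image only.
            "internal_tools": {"focus": True},
        },
    }


focus_body = build_focus_request("https://example.com/receipt.jpg",
                                 "What is the total on this receipt?")
print(json.dumps(focus_body["vision_config"]))
```

No annotation_format is set in this sketch because the request asks for a text answer rather than grounded output.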
Streaming
Set "stream": true to receive Server-Sent Events (SSE). To get token usage, also set stream_options.include_usage: true. When enabled, usage is attached to the final chunk (the one with finish_reason: "stop"), immediately before data: [DONE].
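As a rough illustration of consuming such a stream, the sketch below accumulates text from SSE data: lines and captures usage when a chunk carries it. The chunk layout (choices[].delta.content) and the usage field names are assumed from the OpenAI streaming format, and the sample lines are hand-written for illustration, not captured server output.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate streamed text and capture usage from SSE data lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
        if chunk.get("usage"):
            usage = chunk["usage"]  # present when include_usage is true
    return "".join(text_parts), usage


# Hand-written sample stream (illustrative only).
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"delta": {}, "finish_reason": "stop"}], '
    '"usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text, usage)
```

In practice the lines would come from a streaming HTTP response rather than a list.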
Best Practices
- Thinking pairs well with text and clipping; not with spatial detection. Turn enable_thinking on for text Q&A, captioning, OCR, and annotation_format: "clip". Turn it off for "point", "box", and "polygon".
- Leave temperature unset. The default is 0.0 (deterministic). Only set a non-zero value if you want more varied outputs.
- Image format: HTTP(S) URLs and base64 data URLs are both supported. MIME types: image/png, image/jpeg, image/webp, video/mp4, video/webm.
- Token limits: 32K context, 8K output.
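For local files, a base64 data URL can be built with the standard library. This is a minimal sketch: the helper name to_data_url is hypothetical, and the client-side MIME check simply mirrors the list above rather than any documented server validation.

```python
import base64

# MIME types listed in the best practices above.
SUPPORTED_MIME_TYPES = {
    "image/png", "image/jpeg", "image/webp", "video/mp4", "video/webm",
}


def to_data_url(data: bytes, mime_type: str) -> str:
    """Encode raw media bytes as a base64 data URL (hypothetical helper)."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG signature stands in for real file contents.
png_signature = b"\x89PNG\r\n\x1a\n"
data_url = to_data_url(png_signature, "image/png")
print(data_url)
```

The resulting string goes wherever an HTTP(S) URL would, for example in the url field of a media content part. Base64 inflates the payload by roughly a third, which counts toward the request body size limit.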
Limits
| Limit | Value |
|---|---|
| Requests | 300/min |
| Request body size | 20 MB |
| Media upload | 20 GB per 48 hours |
Authorizations
Bearer token authentication using your Perceptron API key
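A minimal sketch of the request headers for bearer authentication. The environment variable name PERCEPTRON_API_KEY and the helper name auth_headers are assumptions for illustration; the placeholder fallback exists only so the snippet runs without a real key.

```python
import os


def auth_headers(api_key: str) -> dict:
    """Return headers carrying the API key as a bearer token (hypothetical helper)."""
    if not api_key:
        raise ValueError("an API key is required")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Assumed variable name; replace the placeholder with your real key.
headers = auth_headers(os.environ.get("PERCEPTRON_API_KEY", "placeholder-key"))
print(sorted(headers))
```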
Body
Conversation history listed in order. Supported roles: system, user, assistant.
Author role of the message as defined by the OpenAI Chat Completions spec.
- User message
- System message
- Assistant message
```json
[
  {
    "role": "user",
    "content": [
      {
        "type": "video_url",
        "video_url": {
          "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/tutorials/isaac_frame_by_frame/surf.mp4"
        }
      },
      {
        "type": "text",
        "text": "What happens in this video?"
      }
    ]
  }
]
```
The model to invoke. Available options: perceptron-mk1, isaac-0.2-2b-preview, isaac-0.2-1b, isaac-0.1.
"perceptron-mk1"
Positive values discourage the model from repeating previously used tokens.
-2 <= x <= 2
Maximum number of completion tokens to generate. Perceptron Mk1 has a 32K context window with an 8K output cap.
x >= 0
1024
Positive values encourage the model to introduce new concepts.
-2 <= x <= 2
Regex pattern for constrained generation.
An object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs
which ensures the model will match your supplied JSON schema.
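A minimal sketch of attaching a JSON schema for Structured Outputs. This page elides the contents of json_schema, so the field name response_format and the inner name and schema keys are assumed from the OpenAI Structured Outputs format; the helper and the example schema are hypothetical.

```python
import json


def with_json_schema(body: dict, name: str, schema: dict) -> dict:
    """Return a copy of body that requests schema-constrained output (hypothetical helper)."""
    out = dict(body)  # shallow copy so the caller's dict is not mutated
    out["response_format"] = {
        "type": "json_schema",
        # Assumed OpenAI-style inner layout: a name plus the JSON Schema itself.
        "json_schema": {"name": name, "schema": schema},
    }
    return out


base = {
    "model": "perceptron-mk1",
    "messages": [{"role": "user", "content": "List two fruits."}],
}
fruit_schema = {
    "type": "object",
    "properties": {"fruits": {"type": "array", "items": {"type": "string"}}},
    "required": ["fruits"],
}
structured = with_json_schema(base, "fruit_list", fruit_schema)
print(json.dumps(structured["response_format"], indent=2))
```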
Set to true for SSE streaming. When omitted, the API returns a single JSON response.
Optional streaming flags. Set include_usage: true to receive token counts in the final stream chunk.
Sampling temperature. Lower values yield deterministic replies; higher values explore more creative outputs.
Server default is 0.0 for all Perceptron models (perceptron-mk1, isaac-0.1, isaac-0.2-1b, isaac-0.2-2b-preview). Omit the field unless you specifically want non-deterministic sampling.
0 <= x <= 2
Top-k sampling. The model samples from the top k most likely tokens.
x >= 0
Nucleus sampling probability. The model samples from the smallest token set whose cumulative probability exceeds this threshold.
x <= 1
Perceptron vision-model controls (thinking, spatial output format, internal-tool toggles). Recommended for perceptron-mk1. Isaac-series models (isaac-0.1, isaac-0.2-1b, isaac-0.2-2b-preview) use <hint>...</hint> system messages instead.