Chat Completions
Overview
The Chat Completions API is fully compatible with OpenAI’s chat completions specification, supporting both text-only and multimodal (image) requests. Use it to generate responses from Isaac 0.2. Isaac 0.2 triggers thinking and structured grounding through <hint>...</hint> tags inside a system-role message.
<hint> system messages
Place hint values inside a system-role message. Multiple hints can share one <hint> tag, separated by spaces.
| Hint | Output |
|---|---|
| <hint>BOX</hint> | Bounding boxes |
| <hint>POINT</hint> | Points / keypoints |
| <hint>POLYGON</hint> | Polygon masks |
| <hint>THINK</hint> | Chain-of-thought reasoning |
| <hint>FOCUS</hint> | Enable internal focus tool |
Example: Grounded detection
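A minimal sketch of a grounded detection request body, assuming the OpenAI-style message layout used elsewhere on this page. The helper name `build_detection_request`, the image URL, and the prompt text are illustrative assumptions, not part of the API.

```python
import json


def build_detection_request(image_url: str, prompt: str, hint: str = "BOX") -> dict:
    """Build a request body whose system message carries a single spatial hint."""
    return {
        "model": "isaac-0.2-2b-preview",
        "messages": [
            # The hint goes in a system-role message, wrapped in <hint> tags.
            {"role": "system", "content": f"<hint>{hint}</hint>"},
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            },
        ],
    }


# Placeholder image URL and prompt, chosen only for illustration.
body = build_detection_request(
    "https://example.com/street.jpg",
    "Find every car in the image.",
)
print(json.dumps(body, indent=2))
```

Swapping `hint="POINT"` or `hint="POLYGON"` would request the other spatial output types listed in the table above.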
Example: Counting with grounding
For counting tasks or multi-step spatial reasoning, combining THINK with BOX (or POINT) is helpful on Isaac 0.2. For pure detection without counting, use the spatial hint alone.
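The combination described above could be expressed as a single system message with space-separated hints. This is a sketch only: `hint_message` and `KNOWN_HINTS` are hypothetical helper names, and the image URL is a placeholder.

```python
# Hint names taken from the table on this page.
KNOWN_HINTS = {"BOX", "POINT", "POLYGON", "THINK", "FOCUS"}


def hint_message(*hints: str) -> dict:
    """Return a system-role message with all hints sharing one <hint> tag."""
    if not hints:
        raise ValueError("at least one hint is required")
    unknown = [h for h in hints if h not in KNOWN_HINTS]
    if unknown:
        raise ValueError(f"unknown hint(s): {unknown}")
    return {"role": "system", "content": f"<hint>{' '.join(hints)}</hint>"}


counting_messages = [
    hint_message("THINK", "BOX"),  # reasoning plus bounding boxes
    {
        "role": "user",
        "content": [
            # Placeholder image URL, for illustration only.
            {"type": "image_url", "image_url": {"url": "https://example.com/parking_lot.jpg"}},
            {"type": "text", "text": "How many cars are parked here?"},
        ],
    },
]
print(counting_messages[0]["content"])
```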
Example: OCR without hints
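A hint-free request might look like the sketch below: the message list contains only a user turn, with no system message at all. The image URL and prompt are placeholders.

```python
ocr_request = {
    "model": "isaac-0.2-2b-preview",
    "messages": [
        {
            "role": "user",
            "content": [
                # Placeholder image URL, for illustration only.
                {"type": "image_url", "image_url": {"url": "https://example.com/receipt.png"}},
                {"type": "text", "text": "Transcribe all text in this image."},
            ],
        }
    ],
}

roles = [message["role"] for message in ocr_request["messages"]]
print(roles)
```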
For free-form text tasks like OCR, no hint is needed; just send your prompt.
Streaming
Set "stream": true to receive Server-Sent Events (SSE). To get token usage, also set stream_options.include_usage: true. When enabled, usage is attached to the final chunk (the one with finish_reason: "stop"), immediately before data: [DONE].
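The stream handling described above can be sketched as a small parser over SSE lines. It assumes chunks follow the OpenAI streaming shape (`choices[].delta.content`), which this page implies but does not spell out. The function name `collect_stream` and the sample lines are hand-written for illustration, not captured from the API.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Concatenate streamed text deltas and return (text, usage-or-None)."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            text_parts.append(delta.get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]  # present only when include_usage is enabled
    return "".join(text_parts), usage


# Hand-written sample events (assumed shape).
sample = [
    'data: {"choices": [{"delta": {"content": "Two "}, "finish_reason": null}]}',
    "",
    'data: {"choices": [{"delta": {"content": "cars"}, "finish_reason": "stop"}], '
    '"usage": {"prompt_tokens": 12, "completion_tokens": 2, "total_tokens": 14}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text, usage)
```

In a real client the `lines` iterable would come from the HTTP response body, read line by line.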
Best Practices
- Combining THINK with BOX/POINT is helpful for counting. Use the spatial hint alone for pure detection; add THINK when you need step-by-step reasoning alongside the bounding boxes.
- Leave temperature unset. The default is 0.0 (deterministic). Only set a non-zero value if you want more varied outputs.
- Image format: HTTP(S) URLs and base64 data URLs are both supported. MIME types: image/png, image/jpeg, image/webp.
- Token limits: 8K context.
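For the base64 option, a local image can be wrapped in a data URL before being placed in an `image_url` part. The sketch below uses only the standard library; `to_data_url` and `SUPPORTED_MIME_TYPES` are hypothetical helper names, and the eight bytes used here are just the PNG file signature, not a full image.

```python
import base64

# MIME types listed on this page.
SUPPORTED_MIME_TYPES = {"image/png", "image/jpeg", "image/webp"}


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a base64 data URL."""
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


png_signature = b"\x89PNG\r\n\x1a\n"  # stand-in for real file contents
image_part = {
    "type": "image_url",
    "image_url": {"url": to_data_url(png_signature, "image/png")},
}
print(image_part["image_url"]["url"])
```

Base64 inflates payloads by roughly a third, which is worth keeping in mind against the request body size limit.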
Limits
| Limit | Value |
|---|---|
| Requests | 300/min |
| Request body size | 20 MB |
| Media upload | 20 GB per 48 hours |
Authorizations
Bearer token authentication using your Perceptron API key
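A sketch of attaching the bearer token with the standard library. The endpoint URL below is a placeholder, and the environment variable name `PERCEPTRON_API_KEY` is an assumption; substitute the values from your own account setup. The request is built but not sent.

```python
import json
import os
import urllib.request

# Assumption: placeholder endpoint, not the documented URL.
API_URL = "https://api.example.com/v1/chat/completions"


def build_request(body: dict, api_key: str) -> urllib.request.Request:
    """Create a POST request carrying the API key as a bearer token."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical variable name; falls back to a dummy key so the sketch runs.
api_key = os.environ.get("PERCEPTRON_API_KEY", "placeholder-key")
request = build_request(
    {"model": "isaac-0.2-2b-preview", "messages": [{"role": "user", "content": "Hello"}]},
    api_key,
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted here on purpose.
```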
Body
Conversation history listed in order. Supported roles: system, user, assistant.
Author role of the message as defined by the OpenAI Chat Completions spec.
- User message
- System message
- Assistant message
[
  {
    "role": "system",
    "content": "<hint>THINK</hint>"
  },
  {
    "role": "user",
    "content": [
      {
        "type": "image_url",
        "image_url": {
          "url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/qna/studio_scene.webp"
        }
      },
      {
        "type": "text",
        "text": "What stands out in this scene?"
      }
    ]
  }
]
The model to invoke. Available options: isaac-0.2-2b-preview, isaac-0.2-1b, isaac-0.1, perceptron-mk1.
"isaac-0.2-2b-preview"
Positive values discourage the model from repeating previously used tokens.
-2 <= x <= 2
Maximum number of completion tokens to generate. Isaac 0.2 (1B and 2B Preview) has an 8K context window shared between input and output.
x >= 0
1024
Positive values encourage the model to introduce new concepts.
-2 <= x <= 2
Regex pattern for constrained generation.
An object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs
which ensures the model will match your supplied JSON schema.
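A sketch of a Structured Outputs request. The inner `name` and `schema` keys follow the OpenAI structured-outputs convention, which is assumed here because this page does not list them. The schema itself (`object_count` with `label` and `count`) is made up for illustration.

```python
import json

response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "object_count",  # illustrative schema name
        "schema": {
            "type": "object",
            "properties": {
                "label": {"type": "string"},
                "count": {"type": "integer", "minimum": 0},
            },
            "required": ["label", "count"],
            "additionalProperties": False,
        },
    },
}

request_body = {
    "model": "isaac-0.2-2b-preview",
    "messages": [{"role": "user", "content": "Report the most common object and its count."}],
    "response_format": response_format,
}
print(json.dumps(request_body, indent=2))
```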
Set to true for SSE streaming. When omitted, the API returns a single JSON response.
Optional streaming flags. Set include_usage: true to receive token counts in the final stream chunk.
Sampling temperature. Lower values yield deterministic replies; higher values explore more creative outputs.
Server default is 0.0 for all Perceptron models (perceptron-mk1, isaac-0.1, isaac-0.2-1b, isaac-0.2-2b-preview). Omit the field unless you specifically want non-deterministic sampling.
0 <= x <= 2
Top-k sampling. The model samples from the top k most likely tokens.
x >= 0
Nucleus sampling probability. The model samples from the smallest token set whose cumulative probability exceeds this threshold.
x <= 1
Perceptron vision-model controls (thinking, spatial output format, internal-tool toggles). Recommended for perceptron-mk1. Isaac-series models (isaac-0.1, isaac-0.2-1b, isaac-0.2-2b-preview) use <hint>...</hint> system messages instead.