Authorization header containing your API key as a Bearer token:
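A minimal sketch of building the required headers, assuming the key is read from a `PERCEPTRON_API_KEY` environment variable (the variable name is an assumption, not confirmed by this reference):

```python
import os

# Hypothetical setup: the env var name is an assumption for illustration.
API_KEY = os.environ.get("PERCEPTRON_API_KEY", "sk-placeholder")

headers = {
    "Authorization": f"Bearer {API_KEY}",  # Bearer token authentication
    "Content-Type": "application/json",
}
```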
- `<hint>BOX</hint>` - Optimizes the model for bounding box detection tasks
- `<hint>POINT</hint>` - Optimizes the model for point/keypoint detection tasks
- `<hint>POLYGON</hint>` - Optimizes the model for polygon/segmentation tasks

Include hints in the system message `content` field to guide model behavior.
For example, include `<hint>BOX</hint>` to optimize for grounded question answering with bounding boxes:
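A minimal sketch of a request body using the `BOX` hint in the system message; the prompt text is a placeholder, and only the `<hint>` placement follows this reference:

```python
# The hint goes in the system message's content field.
system_message = {
    "role": "system",
    "content": "<hint>BOX</hint> You are a visual grounding assistant.",
}

request_body = {
    "model": "isaac-0.1",
    "messages": [
        system_message,
        {"role": "user", "content": "Where is the dog in this image?"},  # placeholder prompt
    ],
    "temperature": 0.0,  # deterministic outputs, recommended for detection
}
```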
A final usage chunk is emitted at the end of the stream when `include_usage` is true.
- `0.0` for optimal performance and deterministic outputs
- `0.7` for balanced creativity and coherence
- Use hints (`BOX`, `POINT`, `POLYGON`) in system messages when performing detection or localization
- Use a temperature of `0.0` for deterministic outputs (Isaac 0.1), higher values (`0.7`) for creative tasks
- Set `max_completion_tokens` to control response length and costs
- Authenticate with Bearer token authentication using your Perceptron API key
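The recommendations above can be combined into a single request body; this is a sketch with a placeholder prompt, and the token limit of `512` is an arbitrary example value:

```python
# A request body applying the documented recommendations.
payload = {
    "model": "isaac-0.1",
    "messages": [
        {"role": "system", "content": "<hint>POINT</hint> Locate keypoints precisely."},
        {"role": "user", "content": "Mark the corners of the sign."},  # placeholder
    ],
    "temperature": 0.0,            # deterministic outputs for Isaac 0.1
    "max_completion_tokens": 512,  # bound response length and cost
}
```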
Conversation history, listed in order. Supported roles: `system`, `user`, `assistant`.
Author role of the message as defined by the OpenAI Chat Completions spec.
The model to invoke. Available options: `isaac-0.1`, `qwen3-vl-235b-a22b-thinking`.
Positive values discourage the model from repeating previously used tokens.
`-2 <= x <= 2`

Maximum number of completion tokens to generate.
Model-specific limits:
`x >= 0`

Positive values encourage the model to introduce new concepts.
`-2 <= x <= 2`

An object specifying the format that the model must output.
Setting to `{ "type": "json_schema", "json_schema": {...} }` enables Structured Outputs, which ensures the model's output will match your supplied JSON schema.
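A sketch of a Structured Outputs `response_format` object; the schema contents are a made-up example, and the inner `name`/`schema` keys follow the common OpenAI Chat Completions convention rather than anything confirmed by this reference:

```python
# Example response_format for Structured Outputs; the schema is illustrative.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "detection_result",  # assumed key, per OpenAI convention
        "schema": {
            "type": "object",
            "properties": {
                "label": {"type": "string"},
                "box": {"type": "array", "items": {"type": "number"}},
            },
            "required": ["label", "box"],
        },
    },
}
```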
Set to true for SSE streaming. When omitted, the API returns a single JSON response.
Optional streaming flags. The gateway always sets `include_usage = true`, so usage/token counts are emitted at the end of the stream.
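A minimal sketch of consuming the stream client-side. The `data: {...}` line format terminated by `data: [DONE]` follows the common Chat Completions SSE convention and is an assumption here, not something this reference specifies:

```python
import json

def parse_sse_line(line: str):
    """Decode one SSE line into a JSON chunk; return None for blanks and [DONE]."""
    if not line.startswith("data: "):
        return None  # comments, blank keep-alive lines, etc.
    data = line[len("data: "):]
    if data == "[DONE]":
        return None  # assumed end-of-stream sentinel
    return json.loads(data)

chunk = parse_sse_line('data: {"choices": [{"delta": {"content": "Hi"}}]}')
```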
Sampling temperature. Lower values yield deterministic replies; higher values explore more creative outputs.
Model-specific recommendations:
- `isaac-0.1`: `0.0`
- `qwen3-vl-235b-a22b-thinking`: `0.7`

`0 <= x <= 2`

Nucleus sampling probability. The model samples from the smallest token set whose cumulative probability exceeds this threshold.
`0 < x <= 1`