Overview
Thinking exposes the model’s chain-of-thought. The SDK extracts <think>...</think> spans and returns them under reasoning while keeping the final text clean. Reasoning is opt-in unless the model is marked as “thinking”.
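To make the separation concrete, here is a minimal sketch of how <think> spans could be split from the answer text. It is not the SDK's implementation; split_reasoning and THINK_RE are hypothetical names, and the regex approach is an assumption used only for illustration.

```python
import re

# Non-greedy match for one <think>...</think> span; DOTALL lets a trace span lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> dict:
    """Separate reasoning traces from the answer text (hypothetical helper)."""
    traces = [m.strip() for m in THINK_RE.findall(raw)]
    clean = THINK_RE.sub("", raw).strip()
    return {"reasoning": traces, "text": clean}


# Hand-written sample output, not produced by a real model.
raw_output = "<think>The sign is red and octagonal.</think>It is a stop sign."
parsed = split_reasoning(raw_output)
print(parsed["reasoning"])
print(parsed["text"])
```

In practice the SDK already does this for you, so the sketch is only meant to show what "extracted" and "clean" mean here.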
When reasoning runs
- Set reasoning=True to turn on reasoning for any perceive call or CLI run.
- Models that disable reasoning (for example isaac-0.1) drop the flag and warn instead of erroring.
- Without reasoning=True, calls return answers with no <think> trace.
- Note that qwen3-vl-235b-a22b-thinking always runs with reasoning; reasoning=False is ignored with a warning.
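The rules above amount to a small decision table. The sketch below restates them as code under stated assumptions: resolve_reasoning and the two model sets are hypothetical, and only the two model names come from this page. It is not how the SDK is known to implement the check.

```python
import warnings
from typing import Optional

# Hypothetical lookup sets for illustration; only the model names are from this page.
NO_REASONING_MODELS = {"isaac-0.1"}
ALWAYS_REASONING_MODELS = {"qwen3-vl-235b-a22b-thinking"}


def resolve_reasoning(model: str, requested: Optional[bool] = None) -> bool:
    """Return whether reasoning is effectively on for a call (illustrative only)."""
    if model in NO_REASONING_MODELS:
        if requested:
            # The flag is dropped with a warning rather than raising an error.
            warnings.warn(f"{model} does not support reasoning; flag dropped")
        return False
    if model in ALWAYS_REASONING_MODELS:
        if requested is False:
            # An explicit opt-out is ignored with a warning.
            warnings.warn(f"{model} always reasons; reasoning=False ignored")
        return True
    # Every other model is opt-in.
    return bool(requested)


print(resolve_reasoning("qwen3-vl-235b-a22b-thinking", True))
```

Calling resolve_reasoning("isaac-0.1", True) in this sketch returns False and emits a warning, mirroring the "drop the flag and warn" behavior described above.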
How to request reasoning
- Reasoning flag: @perceive(..., reasoning=True) or, for streaming, perceive(..., reasoning=True, stream=True) (defaults to text outputs).
- High-level helpers: pass the same flag, e.g. caption(..., reasoning=True), detect(..., reasoning=True), or question(..., reasoning=True).
Streaming usage (Python)
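A minimal sketch of consuming a reasoning stream, assuming events arrive as dicts in the shape this page describes. fake_stream is a hand-written stand-in for a real perceive(..., reasoning=True, stream=True) call, and the "chunk" key on delta events is an assumption; check the event payload your SDK version actually emits.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[dict]:
    """Stand-in for a streaming call; yields hand-written events, not SDK output."""
    yield {"type": "text.delta", "chunk": "<think>Two cars, "}
    yield {"type": "text.delta", "chunk": "one bus.</think>"}
    yield {"type": "text.delta", "chunk": "There are three vehicles."}
    yield {
        "type": "final",
        "result": {
            "text": "There are three vehicles.",
            "reasoning": ["Two cars, one bus."],
        },
    }


def consume(events: Iterable[dict]) -> dict:
    """Print deltas as they arrive and return the final result dict."""
    final_result: dict = {}
    for event in events:
        if event["type"] == "text.delta":
            # Deltas carry the <think> trace first, then the answer text.
            print(event["chunk"], end="", flush=True)
        elif event["type"] == "final":
            final_result = event["result"]
    print()
    return final_result


result = consume(fake_stream())
print("reasoning:", result["reasoning"])
print("answer:", result["text"])
```

The loop shows the raw deltas for live display but reads the trace from the final event, so no tag parsing is needed on the client side.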
Non-streaming usage (Python)
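A minimal sketch of handling a non-streaming result, assuming the call returns a mapping with the text and reasoning keys described for the final payload. sample_result is a hand-written stand-in rather than real SDK output, and format_result is a hypothetical helper.

```python
# Hand-written stand-in for what a call with reasoning=True might return;
# the dict layout is an assumption based on the final payload shape.
sample_result = {
    "text": "A dog catching a frisbee in a park.",
    "reasoning": ["The image shows grass, a dog mid-jump, and a disc."],
}


def format_result(result: dict) -> str:
    """Render the answer followed by numbered reasoning traces, if any."""
    lines = [f"answer: {result['text']}"]
    for i, trace in enumerate(result.get("reasoning") or [], start=1):
        lines.append(f"trace {i}: {trace}")
    return "\n".join(lines)


print(format_result(sample_result))
# A call made without reasoning=True would carry no traces.
print(format_result({"text": "A dog catching a frisbee in a park."}))
```

Using result.get("reasoning") keeps the same code path working for calls made with and without the flag.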
Event and result shape
- Streaming emits text.delta chunks; reasoning appears first, wrapped in <think>...</think>, followed by the answer text. There is no separate reasoning.delta channel.
- The final payload is { "type": "final", "result": { ... } }, where result["reasoning"] is a list of extracted reasoning strings and result["text"] omits the <think> tags.
- Structured helpers (detect, ocr, caption, question, etc.) pass through the same reasoning flag and include traces in their final results.
Tips
- Use a low temperature for repeatable traces.
- Request reasoning only when you need it; thinking models already enable it automatically.
- Capture the final reasoning list for logging instead of scraping <think> tags yourself.