Overview

Thinking exposes the model’s chain-of-thought. The SDK extracts <think>...</think> spans and returns them under reasoning while keeping the final text clean. Reasoning is opt-in unless the model is marked as “thinking”.
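The extraction can be pictured as a small transform over the raw model output: strip every <think>...</think> span into a reasoning list and return the remainder as clean text. A minimal sketch (illustrative only; this is not the SDK's internal implementation):

```python
import re

# Non-greedy match so multiple <think> spans are captured separately.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(raw: str) -> tuple[str, list[str]]:
    """Separate <think>...</think> spans from the final answer text."""
    reasoning = [m.strip() for m in THINK_RE.findall(raw)]
    text = THINK_RE.sub("", raw).strip()
    return text, reasoning

text, reasoning = split_reasoning(
    "<think>The forks are raised under a load.</think>It is lifting a pallet."
)
# text == "It is lifting a pallet."
# reasoning == ["The forks are raised under a load."]
```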

When reasoning runs

  • Set reasoning=True to turn on reasoning for any perceive call or CLI run.
  • Models that do not support reasoning (for example isaac-0.1) drop the flag and emit a warning instead of raising an error.
  • Without reasoning=True, calls return answers with no <think> trace.
  • Note that qwen3-vl-235b-a22b-thinking always runs with reasoning; reasoning=False is ignored with a warning.

How to request reasoning

  • Reasoning flag: @perceive(..., reasoning=True), or @perceive(..., reasoning=True, stream=True) for streaming (outputs default to text).
  • High-level helpers: pass the same flag, e.g. caption(..., reasoning=True), detect(..., reasoning=True), or question(..., reasoning=True).
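Conceptually, the helpers just forward the flag to the underlying perceive call. A toy sketch of that pass-through (the helper names mirror the docs, but the bodies are hypothetical stand-ins, not the SDK's implementation):

```python
def _perceive(prompt: str, *, reasoning: bool = False, **kwargs) -> dict:
    # Stand-in for the real perceive call; records which flags it received.
    return {"prompt": prompt, "reasoning": reasoning, **kwargs}

def caption(img: str, **kwargs) -> dict:
    # High-level helpers forward reasoning (and other flags) unchanged.
    return _perceive(f"Caption: {img}", **kwargs)

def question(img: str, q: str, **kwargs) -> dict:
    return _perceive(f"{q} ({img})", **kwargs)

print(caption("warehouse.webp", reasoning=True)["reasoning"])  # True
print(question("warehouse.webp", "What is happening?")["reasoning"])  # False
```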

Streaming usage (Python)

import perceptron
from perceptron import perceive, image, text

perceptron.configure(provider="perceptron", api_key="<your_api_key_here>")

@perceive(model="qwen3-vl-235b-a22b-thinking", stream=True, reasoning=True)
def forklift_status(img):
  return image(img) + text("What is the forklift doing?")

for event in forklift_status("https://example.com/warehouse.webp"):
  if event["type"] == "text.delta":
    print(event["chunk"], end="")
  elif event["type"] == "final":
    res = event["result"]
    print("\nanswer:", res.get("text"))
    print("reasoning trace:", res.get("reasoning"))

Non-streaming usage (Python)

import perceptron
from perceptron import perceive, image, text

perceptron.configure(provider="perceptron", api_key="<your_api_key_here>")

@perceive(model="isaac-0.2-2b-preview", reasoning=True)
def forklift_status(img):
  return image(img) + text("What is the forklift doing?")

result = forklift_status("https://example.com/warehouse.webp")
print("answer:", result.text)
print("reasoning trace:", result.reasoning)

Event and result shape

  • Streaming emits text.delta chunks; reasoning appears first wrapped in <think>...</think>, followed by answer text. There is no separate reasoning.delta channel.
  • The final payload is { "type": "final", "result": { ... } } where result["reasoning"] is a list of extracted reasoning strings and result["text"] omits the <think> tags.
  • Structured helpers (detect, ocr, caption, question, etc.) pass through the same reasoning flag and include traces in their final results.
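Putting the shapes together: a self-contained sketch that replays a canned event stream (the events below are fabricated for illustration) and shows that reasoning arrives inline in the raw deltas, while the final payload carries the cleaned-up fields:

```python
# Fabricated events shaped like the stream described above.
events = [
    {"type": "text.delta", "chunk": "<think>Forks are raised"},
    {"type": "text.delta", "chunk": " under a pallet.</think>"},
    {"type": "text.delta", "chunk": "Lifting a pallet."},
    {"type": "final", "result": {
        "text": "Lifting a pallet.",
        "reasoning": ["Forks are raised under a pallet."],
    }},
]

streamed = ""
final = None
for event in events:
    if event["type"] == "text.delta":
        streamed += event["chunk"]   # raw stream still contains <think> tags
    elif event["type"] == "final":
        final = event["result"]      # cleaned text plus extracted reasoning list

assert "<think>" in streamed
assert final["text"] == "Lifting a pallet."
assert final["reasoning"] == ["Forks are raised under a pallet."]
```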

Tips

  • Use low temperature for repeatable traces.
  • Request reasoning only when you need it; thinking models already enable it automatically.
  • Capture the final reasoning list for logging instead of scraping <think> tags yourself.

See the full request/response schema in the Chat Completions API.