Prompting reference - Perceptron Docs

Isaac 0.2 enables thinking (reasoning traces) and grounding through `<hint>...</hint>` tags sent in a system message. See the API reference for details.

Quick reference

Task	SDK Helper	Optimal Prompt
Concise caption	`caption(style="concise")`	`Provide a concise, human-friendly caption for the upcoming image.`
Detailed caption	`caption(style="detailed")`	`Provide a detailed caption describing key objects, relationships, and context in the upcoming image.`
OCR	`ocr()`	System: `You are an OCR system. Accurately detect, extract, and transcribe all readable text from the image.`
General detection	`detect()`	`Your goal is to segment out the objects in the scene`
Class detection	`detect(classes=[...])`	`Your goal is to segment out the following categories: {categories}`
Visual Q&A	`question()`	Pass your question directly as user content
Grounded Q&A	`question(expects="box")`	Same question; the model returns boxes along with its answer
Counting	`question()`	`How many {objects} are there? Point to each.`
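
Two of the prompts above contain placeholders (`{categories}` and `{objects}`). A minimal sketch of filling them with plain string formatting is shown here. The helper names are hypothetical and not part of the SDK, and joining categories with a comma is an assumption, since the table does not specify a separator.

```python
# Prompt templates copied verbatim from the quick-reference table.
CLASS_DETECTION_TEMPLATE = "Your goal is to segment out the following categories: {categories}"
COUNTING_TEMPLATE = "How many {objects} are there? Point to each."


def class_detection_prompt(classes: list[str]) -> str:
    """Fill the class-detection template (hypothetical helper).

    Joining with ", " is an assumption; the table does not name a separator.
    """
    if not classes:
        raise ValueError("classes must contain at least one category")
    return CLASS_DETECTION_TEMPLATE.format(categories=", ".join(classes))


def counting_prompt(objects: str) -> str:
    """Fill the counting template (hypothetical helper)."""
    return COUNTING_TEMPLATE.format(objects=objects)


if __name__ == "__main__":
    print(class_detection_prompt(["helmet", "vest"]))
    print(counting_prompt("helmets"))
```

When using the SDK helpers (`detect(classes=[...])`, `question()`), the helper presumably builds these strings itself; the sketch is only relevant when calling the API directly.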

Grounding on Isaac 0.2 (`<hint>` syntax)

Place hint values inside a system-role message. Multiple hints can share one `<hint>...</hint>` tag, separated by spaces, for example `<hint>BOX THINK</hint>`.

Hint	Output
`<hint>BOX</hint>`	Bounding boxes
`<hint>POINT</hint>`	Points
`<hint>POLYGON</hint>`	Polygons
`<hint>THINK</hint>`	Reasoning trace
`<hint>FOCUS</hint>`	Internal focus tool
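
As a rough sketch of the rule above, the helper below assembles a system message from one or more hint values and rejects anything not listed in the table. The function name `hint_system_message` and the validation step are assumptions for illustration; the API itself may accept or ignore other values.

```python
# Hint values listed in the table above.
VALID_HINTS = ("BOX", "POINT", "POLYGON", "THINK", "FOCUS")


def hint_system_message(*hints: str) -> dict:
    """Build a system-role message carrying one <hint>...</hint> tag.

    Hypothetical helper: hints are joined with single spaces inside
    a single tag, as described in the text above.
    """
    if not hints:
        raise ValueError("at least one hint is required")
    unknown = [h for h in hints if h not in VALID_HINTS]
    if unknown:
        raise ValueError(f"unknown hint(s): {unknown}")
    tag = "<hint>" + " ".join(hints) + "</hint>"
    return {"role": "system", "content": [{"type": "text", "text": tag}]}


if __name__ == "__main__":
    print(hint_system_message("BOX", "THINK"))
```

The returned dictionary has the same shape as the system message used in the curl examples in this section.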

Example: boxes

curl -X POST "https://api.perceptron.inc/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
  "model": "isaac-0.2-2b-preview",
  "messages": [
    { "role": "system", "content": [{"type": "text", "text": "<hint>BOX</hint>"}] },
    { "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "<image-url>"}},
        {"type": "text", "text": "Find all the safety equipment"}
      ]
    }
  ]
}'

Example: reasoning

curl -X POST "https://api.perceptron.inc/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
  "model": "isaac-0.2-2b-preview",
  "messages": [
    { "role": "system", "content": [{"type": "text", "text": "<hint>THINK</hint>"}] },
    { "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "<image-url>"}},
        {"type": "text", "text": "Count the number of cars, excluding buses. Explain your reasoning."}
      ]
    }
  ]
}'

Example: counting (boxes + thinking)

curl -X POST "https://api.perceptron.inc/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
  "model": "isaac-0.2-2b-preview",
  "messages": [
    { "role": "system", "content": [{"type": "text", "text": "<hint>BOX THINK</hint>"}] },
    { "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "<image-url>"}},
        {"type": "text", "text": "Count the helmets on visible workers and box each one."}
      ]
    }
  ]
}'
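
The three curl requests differ only in the hint string and the user prompt. A minimal Python sketch of the same request using only the standard library follows. The endpoint, headers, model name, and message layout are taken from the curl examples; the function names are hypothetical, and the response is printed as raw JSON because its schema is not described here.

```python
import json
import os
import urllib.request

API_URL = "https://api.perceptron.inc/v1/chat/completions"
MODEL = "isaac-0.2-2b-preview"


def build_payload(hints: str, image_url: str, prompt: str) -> dict:
    """Mirror the JSON body of the curl examples (hypothetical helper)."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system",
             "content": [{"type": "text", "text": f"<hint>{hints}</hint>"}]},
            {"role": "user",
             "content": [
                 {"type": "image_url", "image_url": {"url": image_url}},
                 {"type": "text", "text": prompt},
             ]},
        ],
    }


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Wrap the payload in a POST request with the same headers as curl."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(payload: dict, api_key: str) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(payload, api_key)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_payload(
        "BOX THINK",
        "<image-url>",  # placeholder, as in the curl examples
        "Count the helmets on visible workers and box each one.",
    )
    key = os.environ.get("PERCEPTRON_API_KEY")
    # Only call the API when a key is configured; otherwise show the body.
    print(json.dumps(send(payload, key) if key else payload, indent=2))
```

Switching between the boxes, reasoning, and counting examples then comes down to changing the first and third arguments of `build_payload`.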

Advanced: `@perceive` decorator

The Python SDK wraps the chat-completions endpoint with a typed decorator that handles hint setup and result parsing for you.

With reasoning

from perceptron import configure, perceive, image, text

configure(provider="perceptron", api_key="YOUR_API_KEY")

@perceive(model="isaac-0.2-2b-preview", reasoning=True)
def analyze(photo):
    return image(photo) + text("Identify all the colors in this scene")

result = analyze("scene.jpg")
print(result.reasoning)  # chain-of-thought trace
print(result.text)

With structured output (Pydantic)

from typing import Literal
from pydantic import BaseModel, Field
from perceptron import configure, perceive, pydantic_format, image, text

configure(provider="perceptron", api_key="YOUR_API_KEY")

class SceneAnalysis(BaseModel):
    scene_type: Literal["urban", "nature"]
    main_subjects: list[str] = Field(description="Primary objects in the scene")
    mood: Literal["energetic", "peaceful", "tense"]
    time_of_day: Literal["day", "night", "unknown"]

@perceive(model="isaac-0.2-2b-preview", response_format=pydantic_format(SceneAnalysis))
def analyze_scene(photo):
    return image(photo) + text("Analyze this scene. Output in JSON with scene type, subjects, mood and time of day.")

scene = analyze_scene("photo.jpg")
print(f"Scene type: {scene.scene_type}")
print(f"Subjects: {scene.main_subjects}")
print(f"Mood: {scene.mood}")
print(f"Time: {scene.time_of_day}")
