

The question() helper accepts a video() node alongside a natural-language prompt and returns a textual answer. Combine it with reasoning=True for step-by-step analysis of long-horizon episodes: assembly walkthroughs, training videos, customer-session recordings, and other content where the answer depends on watching what happens over time.

Basic usage

from perceptron import question, video

video_path = "path/to/video.mp4"  # local path or URL to an MP4 or WebM

result = question(
    video(video_path),              # wrap the video with video()
    "What happens in this video?",  # natural-language question
    reasoning=True,                 # include the model's chain-of-thought
)

print(result.reasoning)  # Model's chain-of-thought when reasoning=True
print(result.text)       # The answer
Parameters:
  • media_obj (VideoNode): Wrap your MP4 or WebM (URL or local file path) with video().
  • question_text (str): The question to ask about the video.
  • reasoning (bool, default False): Set True to enable reasoning and include the model's chain-of-thought.
  • expects (str, default "text"): Output structure for the SDK ("text", "clip", "point", "box", "polygon").
Returns: PerceiveResult object:
  • text (str): The answer to your question.
  • reasoning (str | None): The model’s chain-of-thought when reasoning=True.
  • clips, points, boxes, polygons (list | None): Populated when the corresponding expects is requested.

Example: Robot assembly walkthrough

In this example we download a short robot-assembly clip, ask Isaac to identify the overall goal and the sub-goals it observes, and let it think through the episode before answering.
from pathlib import Path
from urllib.request import urlretrieve

from perceptron import configure, question, video

configure(
    provider="perceptron",
    model="isaac-0.3-max",
    api_key="YOUR_API_KEY",
)

# Download reference video
VIDEO_URL = "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/video-qa/robot_assembly.mp4"
VIDEO_PATH = Path("robot_assembly.mp4")

if not VIDEO_PATH.exists():
    urlretrieve(VIDEO_URL, VIDEO_PATH)

# Ask the question with reasoning enabled
result = question(
    video(str(VIDEO_PATH)),
    "Watch this episode and state the overall goal and sub-goals.",
    reasoning=True,
)

print("--- Reasoning ---")
print(result.reasoning or "(none)")
print("\n--- Answer ---")
print(result.text)

Best practices

  • Reach for expects="clip" when you need timestamps: If the answer needs to point at when something happens in the video, switch to the Video Clipping workflow instead.
Run through the full Jupyter notebook here. Reach out to Perceptron support if you have questions.