1
Install dependencies and configure the SDK
Install the SDK, OpenCV, Pillow, and tqdm. Export an API key so configure() can pick it up. Create frame_by_frame.py and add the imports + configuration block:
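The guide doesn't spell out the SDK's import path, so treat the following as a minimal sketch: the module name perception_sdk is a placeholder, and the file/constant names are assumptions reused by the later snippets.

```python
# frame_by_frame.py -- imports + configuration block (sketch; SDK names are assumptions)
from pathlib import Path

import cv2            # frame extraction and video writing
from PIL import Image  # frames are handed to the model as PIL images
from tqdm import tqdm  # progress bars over the frame loops

# Placeholder module name -- substitute the package you actually installed.
from perception_sdk import configure

configure()  # picks up the exported API key from the environment, per the step above

VIDEO_PATH = Path("surf.mp4")               # input clip (name assumed)
FRAMES_DIR = Path("frames")
ANNOTATED_DIR = Path("frames_annotated")
OUTPUT_VIDEO = Path("surf_annotated.mp4")
STRIDE = 10                                 # sample every Nth frame (tuned in step 3)
```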
2
Download the sample video
Grab the shared surfing clip (or point the script at your own MP4).
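If you prefer to script the download, something like the snippet below works; the URL is a placeholder because the shared clip's location isn't reproduced here.

```python
# Download the sample clip (sketch). Replace the placeholder URL with the shared
# surfing clip, or skip this step and point VIDEO_PATH at your own MP4.
import urllib.request

SAMPLE_URL = "<shared-surfing-clip-url>"  # placeholder, not a real URL
urllib.request.urlretrieve(SAMPLE_URL, "surf.mp4")
```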
3
Extract frames (tune the stride)
We sample one JPG every stride frames so long clips stay manageable. Tuning tip: decrease stride for smoother playback, increase it when you just need periodic samples.
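A minimal extraction loop with OpenCV might look like this (the names match the sketch in step 1 and are assumptions, not the tutorial's exact code):

```python
# Extract one JPG every `stride` frames from the source video.
import cv2
from pathlib import Path
from tqdm import tqdm

def extract_frames(video_path: Path, out_dir: Path, stride: int) -> float:
    """Write every `stride`-th frame as a numbered JPG; return the source FPS."""
    out_dir.mkdir(exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    fps = cap.get(cv2.CAP_PROP_FPS)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    saved = 0
    for index in tqdm(range(total), desc="extracting"):
        ok, frame = cap.read()
        if not ok:
            break
        if index % stride == 0:
            cv2.imwrite(str(out_dir / f"frame_{saved:05d}.jpg"), frame)
            saved += 1
    cap.release()
    return fps
```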
4
Detect surfers in every frame
The @perceive helper wraps Isaac 0.1 so we can send each frame plus a natural-language instruction. The loop captures the raw answer, counts boxes, converts them to pixel coordinates, and draws overlays.
- result.points_to_pixels() keeps the normalized → pixel conversion consistent.
- The all_detections list becomes a quick audit trail (counts + captions per frame).
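A sketch of that loop is below. Because the @perceive wiring is SDK-specific, the detection call is stubbed; the result interface (points_to_pixels(), the raw answer) follows the description above, but the exact signatures are assumptions.

```python
# Detect surfers in each extracted frame and draw overlays (sketch).
import cv2
from pathlib import Path
from PIL import Image
from tqdm import tqdm

def find_surfers(image: Image.Image):
    """Stand-in for the @perceive-decorated helper.

    Replace this with the SDK's @perceive wrapper around Isaac 0.1. It is
    expected to return a result exposing the raw answer text and a
    points_to_pixels() conversion, as described above; both are assumptions.
    """
    raise NotImplementedError("wire this up with the SDK's @perceive decorator")

def annotate_frames(frames_dir: Path, out_dir: Path):
    out_dir.mkdir(exist_ok=True)
    all_detections = []                                     # audit trail: counts + captions per frame
    for jpg in tqdm(sorted(frames_dir.glob("*.jpg")), desc="detecting"):
        image = Image.open(jpg)
        result = find_surfers(image)                        # raw model answer
        points = result.points_to_pixels(image.size)        # normalized -> pixel (signature assumed)
        frame = cv2.imread(str(jpg))
        for x, y in points:                                  # draw one marker per detection
            cv2.circle(frame, (int(x), int(y)), 12, (0, 255, 0), 3)
        cv2.imwrite(str(out_dir / jpg.name), frame)
        all_detections.append({"frame": jpg.name,
                               "count": len(points),
                               "answer": result.answer})     # attribute name assumed
    return all_detections
```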
5
Stitch annotated frames back to MP4
OpenCV writes the annotated JPGs back into a video using the original resolution and your preferred FPS.
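A minimal version of that step, assuming the annotated filenames sort in playback order:

```python
# Stitch the annotated JPGs back into an MP4 at the original resolution.
import cv2
from pathlib import Path

def write_video(frames_dir: Path, out_path: Path, fps: float):
    frames = sorted(frames_dir.glob("*.jpg"))
    first = cv2.imread(str(frames[0]))
    height, width = first.shape[:2]                      # keep the original resolution
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(str(out_path), fourcc, fps, (width, height))
    for jpg in frames:
        writer.write(cv2.imread(str(jpg)))
    writer.release()

# Example: if you only kept every Nth frame, divide the source FPS by the stride
# to preserve real-time pacing, or pass any FPS you prefer.
# write_video(Path("frames_annotated"), Path("surf_annotated.mp4"), fps=30 / STRIDE)
```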
6
Run the pipeline
Execute the script end-to-end and inspect the outputs. You should see:
- frames/ and frames_annotated/ populated with numbered JPGs.
- surf_annotated.mp4 in the project root.
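After running python frame_by_frame.py, a quick check like this confirms the outputs landed where expected (paths match the names above):

```python
# Sanity-check the pipeline outputs.
from pathlib import Path

for directory in (Path("frames"), Path("frames_annotated")):
    count = len(list(directory.glob("*.jpg")))
    print(f"{directory}/: {count} JPGs")

video = Path("surf_annotated.mp4")
print(f"{video}: {'present' if video.exists() else 'missing'}")
```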