The `detect()` helper finds grounded objects in an image and returns normalized geometry. Use it to detect or count items in a scene or track objects across multiple frames.
Basic usage
| Parameter | Type | Default | Description |
|---|---|---|---|
| image_path | str | - | Path or URL to the source image (JPG, PNG, WEBP) |
| classes | list[str] | [] | Labels to look for; use plural lists for multi-target jobs |
| expects | str | "box" | Geometry type for grounded outputs ("box", "point", "polygon") |
| reasoning | bool | False | Set True to enable reasoning and include the model’s chain-of-thought |
| format | str | "text" | CLI output schema; choose "text" for Rich summaries or "json" for machine-readable results |
The `format` argument is only available through the CLI flags (`--format text|json`). The Python helper always returns a `PerceiveResult` object:
- `text` (str): Model summary for the scene
- `points` (list): Bounding boxes, points, or polygons that align with `expects`; there is no separate `result.boxes` attribute
- `points_to_pixels(width, height)`: Helper to convert normalized coordinates to pixels
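As a minimal sketch of a call, assuming `detect()` is importable as shown (the import path is an assumption; the parameter names and `PerceiveResult` fields come from the table and list above):

```python
from perceptron import detect  # import path assumed

result = detect(
    image_path="warehouse.jpg",       # path or URL to a JPG/PNG/WEBP image
    classes=["pallet", "forklift"],   # labels to look for
    expects="box",                    # "box", "point", or "polygon"
    reasoning=False,                  # True to include the chain-of-thought
)

print(result.text)    # model summary of the scene
print(result.points)  # normalized geometry matching `expects`
```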
Example: PPE compliance line
In this example, we download a photo of a factory worker, run detection for hard hats and safety vests, and overlay the returned bounding boxes to visualize the grounded output end to end.

Isaac returns geometry on a 0–1000 normalized axis. Always convert via `result.points_to_pixels(width, height)` before drawing overlays; see the coordinate system guide for more patterns.
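The sketch below illustrates this workflow. The import path, the placeholder URL, and the per-box coordinate layout assumed in the drawing loop are illustrative assumptions; only `detect()`, `result.text`, `result.points`, and `result.points_to_pixels(width, height)` come from the reference above.

```python
import urllib.request

from PIL import Image, ImageDraw

from perceptron import detect  # import path assumed

# Download a sample factory-floor photo (placeholder URL).
url = "https://example.com/factory_worker.jpg"
image_path = "factory_worker.jpg"
urllib.request.urlretrieve(url, image_path)

result = detect(
    image_path=image_path,
    classes=["hard hat", "safety vest"],
    expects="box",
)
print(result.text)  # model summary of the scene

image = Image.open(image_path)
draw = ImageDraw.Draw(image)

# Geometry comes back on a 0-1000 normalized axis; convert to pixels first.
pixel_boxes = result.points_to_pixels(*image.size)

for box in pixel_boxes:
    # Assumes each converted entry reads as (x_min, y_min, x_max, y_max).
    draw.rectangle(box, outline="red", width=3)

image.save("ppe_overlay.jpg")
```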
CLI usage

Run detections straight from the CLI by specifying your source image, target classes, and geometry/output preferences:
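As an illustrative sketch only: the CLI entry point and every flag except `--format` below are assumptions, so check your installed CLI's help output for the actual names.

```bash
# Hypothetical invocation; only --format text|json is documented above.
perceptron detect factory_floor.jpg \
  --classes "hard hat,safety vest" \
  --expects box \
  --format json
```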
Best practices

- Targeted prompts: Call out the exact categories you care about (“helmets, vests, goggles”) and set the `classes` list accordingly so Isaac focuses on those objects.
- One intent per request: Follow the prompting guide’s advice and keep each detection call focused on a single job (PPE, pallets, anomalies). Separate calls prevent the model from juggling conflicting objectives.
- Grounded exemplars: When objects are subtle, attach additional reference frames (multi-image inputs) or short textual descriptors so the model learns the trait you want detected; see the in-context learning sections for more examples.
Run through the full Jupyter notebook here. Reach out to Perceptron support if you have questions.