The `detect()` helper finds grounded objects in an image and returns normalized geometry. Use it to detect or count items in a scene or track objects across multiple frames.
Basic usage
| Parameter | Type | Default | Description |
|---|---|---|---|
| image_path | str | - | Path or URL to the source image (JPG, PNG, WEBP) |
| classes | list[str] | [] | Labels to look for; use plural lists for multi-target jobs |
| expects | str | "box" | Geometry type for grounded outputs ("box", "point", "polygon") |
| reasoning | bool | False | Set True to enable reasoning and include the model’s chain-of-thought |
| format | str | "text" | CLI output schema; choose "text" for Rich summaries or "json" for machine-readable results |
The `format` argument is only available through the CLI flags (`--format text|json`). The Python helper always returns a `PerceiveResult` object:
- `text` (str): Model summary for the scene
- `points` (list): Bounding boxes, points, or polygons that align with `expects`; there is no separate `result.boxes` attribute
- `points_to_pixels(width, height)`: Helper to convert normalized coordinates to pixels
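As a minimal sketch of a call, assuming `detect()` is importable as shown (the import path is an assumption; the parameter names and `PerceiveResult` fields come from the table and list above):

```python
from perceptron import detect  # import path assumed

result = detect(
    image_path="warehouse.jpg",       # path or URL to a JPG/PNG/WEBP image
    classes=["pallet", "forklift"],   # labels to look for
    expects="box",                    # "box", "point", or "polygon"
    reasoning=False,                  # True to include the chain-of-thought
)

print(result.text)    # model summary of the scene
print(result.points)  # normalized geometry matching `expects`
```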
Example: PPE compliance line
In this example, we download a photo of a factory worker, run detection for hard hats and safety vests, and overlay the returned bounding boxes to visualize the grounded output end to end.

Isaac returns geometry on a 0–1000 normalized axis. Always convert via `result.points_to_pixels(width, height)` before drawing overlays; see the coordinate system guide for more patterns.
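The sketch below illustrates this workflow. The import path, the placeholder URL, and the per-box coordinate layout assumed in the drawing loop are illustrative assumptions; only `detect()`, `result.text`, `result.points`, and `result.points_to_pixels(width, height)` come from the reference above.

```python
import urllib.request

from PIL import Image, ImageDraw

from perceptron import detect  # import path assumed

# Download a sample factory-floor photo (placeholder URL).
url = "https://example.com/factory_worker.jpg"
image_path = "factory_worker.jpg"
urllib.request.urlretrieve(url, image_path)

result = detect(
    image_path=image_path,
    classes=["hard hat", "safety vest"],
    expects="box",
)
print(result.text)  # model summary of the scene

image = Image.open(image_path)
draw = ImageDraw.Draw(image)

# Geometry comes back on a 0-1000 normalized axis; convert to pixels first.
pixel_boxes = result.points_to_pixels(*image.size)

for box in pixel_boxes:
    # Assumes each converted entry reads as (x_min, y_min, x_max, y_max).
    draw.rectangle(box, outline="red", width=3)

image.save("ppe_overlay.jpg")
```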
CLI usage

Run detections straight from the CLI by specifying your source image, target classes, and geometry/output preferences:
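As an illustrative sketch only: the CLI entry point and every flag except `--format` below are assumptions, so check your installed CLI's help output for the actual names.

```bash
# Hypothetical invocation; only --format text|json is documented above.
perceptron detect factory_floor.jpg \
  --classes "hard hat,safety vest" \
  --expects box \
  --format json
```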
Best practices

- Targeted prompts: Call out the exact categories you care about (“helmets, vests, goggles”) and set the `classes` list accordingly so Isaac focuses on those objects.
- One intent per request: Follow the prompting guide’s advice and keep each detection call focused on a single job (PPE, pallets, anomalies). Separate calls prevent the model from juggling conflicting objectives.
- Grounded exemplars: When objects are subtle, attach additional reference frames (multi-image inputs) or short textual descriptors so the model learns the trait you want detected; see the in-context learning sections for more examples.
Run through the full Jupyter notebook here. Reach out to Perceptron support if you have questions.