POST /v1/detect

cURL
curl --request POST \
  --url https://api.perceptron.inc/v1/detect \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "media": {
    "image_url": "<string>",
    "type": "image"
  },
  "categories": [
    "<string>"
  ],
  "config": {
    "enable_thinking": false
  },
  "exemplars": [
    {
      "annotations": [
        {
          "mention": "<string>",
          "index": 123,
          "label": "<string>",
          "point": {
            "x": 123,
            "y": 123
          },
          "point_box": {
            "bottom_right": {
              "x": 123,
              "y": 123
            },
            "top_left": {
              "x": 123,
              "y": 123
            }
          },
          "timestamp": 123
        }
      ],
      "media": {
        "image_url": "<string>",
        "type": "image"
      }
    }
  ],
  "negative_exemplars": [
    {
      "media": {
        "image_url": "<string>",
        "type": "image"
      }
    }
  ]
}
'
Response
{
  "detections": [
    {
      "mention": "<string>",
      "point": {
        "x": 123,
        "y": 123
      },
      "point_box": {
        "bottom_right": {
          "x": 123,
          "y": 123
        },
        "top_left": {
          "x": 123,
          "y": 123
        }
      }
    }
  ]
}

Overview

The Detect API returns grounded object detections for a single image. Use it when you want a direct detection workflow without building a chat-completions prompt. A request can specify targets as text categories, as annotated positive exemplars, or as both; with no targets, the API performs exhaustive object detection. You can also include negative_exemplars, but only alongside at least one category or positive exemplar. Coordinates in exemplar annotations and in returned detections are image pixel coordinates.

Examples

Detect all objects as boxes

Omit categories and exemplars to ask the model to return the objects it finds in the image. This is useful for exploratory labeling or scene inventory. Box output is the default.
curl https://api.perceptron.inc/v1/detect \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "media": {
      "type": "image",
      "image_url": "https://example.com/warehouse.jpg"
    },
    "config": {
      "output": "box",
      "enable_thinking": false
    }
  }'

[Image: box output example showing detected packages in a delivery van]
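
The same request can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names build_detect_request and detect are hypothetical, the request fields mirror the cURL example above, and the API key is assumed to be in the PERCEPTRON_API_KEY environment variable as in the cURL examples.

```python
import json
import os
import urllib.request

DETECT_URL = "https://api.perceptron.inc/v1/detect"


def build_detect_request(image_url, categories=None, output="box",
                         enable_thinking=False, api_key=None):
    """Build (but do not send) a POST request for /v1/detect.

    Hypothetical helper: field names follow the cURL examples on this page.
    """
    payload = {
        "media": {"type": "image", "image_url": image_url},
        "config": {"output": output, "enable_thinking": enable_thinking},
    }
    if categories:
        payload["categories"] = list(categories)
    # Assumption: the key lives in PERCEPTRON_API_KEY, as in the cURL examples.
    key = api_key or os.environ.get("PERCEPTRON_API_KEY", "")
    return urllib.request.Request(
        DETECT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )


def detect(image_url, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_detect_request(image_url, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.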

Detect target categories as boxes

Pass categories when you know the labels you want returned. Each detection mention is one of the requested categories.
curl https://api.perceptron.inc/v1/detect \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "media": {
      "type": "image",
      "image_url": "https://example.com/warehouse.jpg"
    },
    "categories": ["person", "hard_hat"],
    "config": {
      "output": "box",
      "enable_thinking": false
    }
  }'

Detect one category as points

Set config.output to "point" when you want one pixel point per detection. With exactly one unique category and no exemplars, /v1/detect uses a direct point prompt and forces every returned mention to that category.
curl https://api.perceptron.inc/v1/detect \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "media": {
      "type": "image",
      "image_url": "https://example.com/warehouse.jpg"
    },
    "categories": ["forklift"],
    "config": {
      "output": "point",
      "enable_thinking": false
    }
  }'

Detect all objects as points

You can also request point output without categories or exemplars. The route still uses exhaustive object detection internally, then returns the center point for each detected box or polygon.
curl https://api.perceptron.inc/v1/detect \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "media": {
      "type": "image",
      "image_url": "https://example.com/warehouse.jpg"
    },
    "config": {
      "output": "point"
    }
  }'

[Image: point output example showing centroid points for detected packages]
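
To illustrate the box-to-point conversion described above, here is a small client-side sketch that computes the midpoint of a point_box. The function name box_center is hypothetical, and the midpoint formula is an assumption about what "center point" means for a box; this page does not specify the service's exact computation, and polygon centroids would need a different calculation.

```python
def box_center(point_box):
    """Return the midpoint of a point_box as a point dict.

    Assumption: the center of a box is the average of its two corners.
    """
    top_left = point_box["top_left"]
    bottom_right = point_box["bottom_right"]
    return {
        "x": (top_left["x"] + bottom_right["x"]) / 2,
        "y": (top_left["y"] + bottom_right["y"]) / 2,
    }


box = {
    "top_left": {"x": 384.0, "y": 108.0},
    "bottom_right": {"x": 768.0, "y": 540.0},
}
print(box_center(box))  # {'x': 576.0, 'y': 324.0}
```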

Detect from exemplars as boxes

Use exemplars when the target is visual or domain-specific. Exemplar annotations use pixel coordinates in the exemplar image. If you omit categories, the exemplar annotation mention values define the target labels.
curl https://api.perceptron.inc/v1/detect \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "media": {
      "type": "image",
      "image_url": "https://example.com/loading-dock.jpg"
    },
    "exemplars": [
      {
        "media": {
          "type": "image",
          "image_url": "data:image/png;base64,..."
        },
        "annotations": [
          {
            "mention": "damaged_box",
            "point_box": {
              "top_left": { "x": 124, "y": 80 },
              "bottom_right": { "x": 310, "y": 240 }
            }
          }
        ]
      }
    ],
    "config": {
      "output": "box",
      "enable_thinking": true
    }
  }'
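
The exemplar image in the request above is passed as a base64 data URL. As a rough sketch of how such an exemplar could be assembled in Python, the hypothetical helpers below encode raw image bytes and wrap them with a single point_box annotation. The helper names and the default image/png MIME type are assumptions for illustration; the field names follow the cURL example.

```python
import base64


def to_data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def make_box_exemplar(image_bytes, mention, top_left, bottom_right):
    """Build one positive exemplar with a single point_box annotation.

    top_left and bottom_right are (x, y) pixel coordinates in the
    exemplar image.
    """
    return {
        "media": {"type": "image", "image_url": to_data_url(image_bytes)},
        "annotations": [
            {
                "mention": mention,
                "point_box": {
                    "top_left": {"x": top_left[0], "y": top_left[1]},
                    "bottom_right": {"x": bottom_right[0], "y": bottom_right[1]},
                },
            }
        ],
    }
```

In practice the bytes would come from reading the exemplar file in binary mode, for example open(path, "rb").read().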

Detect from exemplars as points

For exemplar-guided point output, set config.output to "point". Positive exemplars can use either point or point_box annotations. The route uses the exemplar labels to find matching objects, then returns a point for each detection.
curl https://api.perceptron.inc/v1/detect \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
    "media": {
      "type": "image",
      "image_url": "https://example.com/loading-dock.jpg"
    },
    "categories": ["damaged_box"],
    "exemplars": [
      {
        "media": {
          "type": "image",
          "image_url": "data:image/png;base64,..."
        },
        "annotations": [
          {
            "mention": "damaged_box",
            "point": { "x": 218, "y": 160 }
          },
          {
            "mention": "damaged_box",
            "point_box": {
              "top_left": { "x": 124, "y": 80 },
              "bottom_right": { "x": 310, "y": 240 }
            }
          }
        ]
      }
    ],
    "config": {
      "output": "point",
      "enable_thinking": true
    }
  }'

enable_thinking is optional and defaults to false. Set it to true when you want a supported detect model to use reasoning internally; the native response still returns only detections and finish_reason.

Supported permutations

| Request shape | config.output | Behavior |
| --- | --- | --- |
| media only | "box" or omitted | Exhaustive object detection; returns point_box for every parsed object. |
| media only | "point" | Exhaustive object detection; returns point values converted from parsed boxes or polygons. |
| One or more categories entries, no positive exemplars | "box" | Category detection; returns point_box values. Duplicate categories are deduplicated for prompting. |
| One unique categories entry, no positive or negative exemplars | "point" | Direct point prompt; every returned mention is forced to that category. |
| Multiple unique categories entries | "point" | Category detection; returns centroid point values from parsed boxes or polygons. |
| Positive exemplars, with or without categories | "box" | Exemplar-guided detection; returns point_box values. If categories is omitted, exemplar mention values define the target labels. |
| Positive exemplars, with or without categories | "point" | Exemplar-guided detection; returns point values. Exemplar annotations may use point or point_box. |
| negative_exemplars with categories or positive exemplars | "box" or "point" | Uses negative images as examples of what not to detect. Negative exemplars do not define labels. |
| negative_exemplars only | Any | Invalid request; include at least one category or positive exemplar. |
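
The permutations above can be summarized as a small decision function. This is a client-side sketch for reasoning about a payload before sending it, not the service's routing logic; the function name describe_behavior and its return strings are hypothetical. One case is an assumption: a single category combined with negative exemplars and "point" output is treated here as ordinary category detection, since the table only lists the direct point prompt for requests without exemplars of either kind.

```python
def describe_behavior(payload):
    """Map a request payload to the behavior listed in the table above."""
    categories = payload.get("categories") or []
    unique = list(dict.fromkeys(categories))  # de-duplicate, keep order
    exemplars = payload.get("exemplars") or []
    negatives = payload.get("negative_exemplars") or []
    output = (payload.get("config") or {}).get("output", "box")

    if negatives and not unique and not exemplars:
        return "invalid"
    if exemplars:
        return f"exemplar-guided detection ({output})"
    if not unique:
        return f"exhaustive detection ({output})"
    if output == "point" and len(unique) == 1 and not negatives:
        return "direct point prompt"
    return f"category detection ({output})"


print(describe_behavior({"categories": ["forklift"],
                         "config": {"output": "point"}}))
# direct point prompt
```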

Example response

Box response

{
  "detections": [
    {
      "mention": "hard_hat",
      "point_box": {
        "top_left": { "x": 384.0, "y": 108.0 },
        "bottom_right": { "x": 768.0, "y": 540.0 }
      }
    }
  ],
  "finish_reason": "complete"
}

Point response

{
  "detections": [
    {
      "mention": "forklift",
      "point": { "x": 512.0, "y": 320.0 }
    }
  ],
  "finish_reason": "complete"
}
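
Both response shapes can be handled by one parser. The sketch below, with a hypothetical Detection dataclass and parse_detections helper, reads whichever of point or point_box is present in each detection and reports whether finish_reason was "max_tokens". How a caller should react to "max_tokens" is not specified on this page, so the sketch only exposes the flag.

```python
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Detection:
    mention: str
    point: Optional[Tuple[float, float]] = None  # (x, y)
    box: Optional[Tuple[float, float, float, float]] = None  # (x1, y1, x2, y2)


def parse_detections(response_text):
    """Parse a /v1/detect response body into Detection objects.

    Returns (detections, hit_max_tokens).
    """
    data = json.loads(response_text)
    detections = []
    for item in data["detections"]:
        point = item.get("point")
        box = item.get("point_box")
        detections.append(Detection(
            mention=item["mention"],
            point=(point["x"], point["y"]) if point else None,
            box=(box["top_left"]["x"], box["top_left"]["y"],
                 box["bottom_right"]["x"], box["bottom_right"]["y"])
            if box else None,
        ))
    hit_max_tokens = data["finish_reason"] == "max_tokens"
    return detections, hit_max_tokens
```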

Notes

  • /v1/detect supports image inputs only.
  • config.output can be "box" or "point"; the default is "box". The public value is "box", not "bbox".
  • The response does not include confidence scores.
  • finish_reason is "complete" or "max_tokens".
  • When categories is set, every positive exemplar annotation mention must exactly match one of those categories.
  • Exemplar annotations must include exactly one of point or point_box.
  • Point outputs may come from parsed points directly, or from the centroid of a parsed box or polygon.
  • config.enable_thinking can be true or false for box output, point output, category requests, and exemplar-guided requests. It does not change the response schema.
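
Several of the rules in these notes can be checked on the client before a request is sent. The following sketch is illustrative only: validate_request is a hypothetical helper, its messages are not the API's error messages, and the server remains the source of truth for what is accepted.

```python
def validate_request(payload):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    output = (payload.get("config") or {}).get("output", "box")
    if output not in ("box", "point"):
        problems.append(f'config.output must be "box" or "point", got {output!r}')

    if (payload.get("media") or {}).get("type") != "image":
        problems.append('media.type must be "image"')

    categories = payload.get("categories") or []
    for i, exemplar in enumerate(payload.get("exemplars") or []):
        for j, annotation in enumerate(exemplar.get("annotations") or []):
            where = f"exemplars[{i}].annotations[{j}]"
            has_point = annotation.get("point") is not None
            has_box = annotation.get("point_box") is not None
            # Exactly one of point or point_box must be present.
            if has_point == has_box:
                problems.append(f"{where}: include exactly one of point or point_box")
            # With categories set, each mention must match one exactly.
            if categories and annotation.get("mention") not in categories:
                problems.append(f"{where}: mention must match one of categories")
    return problems
```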

Limits

| Limit | Value |
| --- | --- |
| categories entries | 256 |
| Positive exemplars | 64 |
| negative_exemplars | 64 |
| Annotations per exemplar | 256 |

See the API reference below for the complete request and response schemas.
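
A client can check these limits before sending a request. The sketch below uses a hypothetical check_limits helper and makes two assumptions that this page does not settle: the categories limit is applied to the raw list (before any de-duplication), and the per-exemplar annotation limit is checked on positive exemplars only.

```python
LIMITS = {"categories": 256, "exemplars": 64, "negative_exemplars": 64}
MAX_ANNOTATIONS_PER_EXEMPLAR = 256


def check_limits(payload):
    """Return a list of limit violations; an empty list means none."""
    problems = []
    for field, limit in LIMITS.items():
        count = len(payload.get(field) or [])
        if count > limit:
            problems.append(f"{field}: {count} entries exceeds limit of {limit}")
    for i, exemplar in enumerate(payload.get("exemplars") or []):
        count = len(exemplar.get("annotations") or [])
        if count > MAX_ANNOTATIONS_PER_EXEMPLAR:
            problems.append(
                f"exemplars[{i}]: {count} annotations exceeds limit of "
                f"{MAX_ANNOTATIONS_PER_EXEMPLAR}"
            )
    return problems
```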

Authorizations

Authorization (string, header, required)

Bearer token authentication using your Perceptron API key.

Body

application/json
media (object, required)

categories (string[] | null)
Category names to detect. Entries must be non-empty and must not contain commas.

config (object)

exemplars (object[] | null)

negative_exemplars (object[] | null)
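
The constraints on categories entries (non-empty, no commas) are easy to enforce client-side. The helper below is a hypothetical sketch; it also removes duplicates while preserving order, which is optional given that duplicate categories are deduplicated for prompting.

```python
def clean_categories(categories):
    """Validate category names and drop duplicates, keeping first occurrences.

    Raises ValueError for entries that are empty or contain a comma.
    """
    if categories is None:
        return None
    cleaned = []
    for name in categories:
        if not isinstance(name, str) or not name:
            raise ValueError("category entries must be non-empty strings")
        if "," in name:
            raise ValueError(f"category {name!r} must not contain a comma")
        if name not in cleaned:
            cleaned.append(name)
    return cleaned
```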

Response

Detection completed successfully.

detections (object[], required)

finish_reason (enum<string>, required)
Available options: complete, max_tokens