Isaac is a vision-language model that understands images. Ask it questions, detect objects, read text, or get captions, all through a simple API.

Try Isaac in 30 seconds

Create an API key

Get your key from the Perceptron platform.
Then pick your preferred method:
curl -X POST "https://api.perceptron.inc/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PERCEPTRON_API_KEY" \
  -d '{
  "model": "isaac-0.2-2b-preview",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "https://raw.githubusercontent.com/perceptron-ai-inc/perceptron/main/cookbook/_shared/assets/capabilities/qna/studio_scene.webp"}},
        {"type": "text", "text": "What is in this image?"}
      ]
    }
  ]
}'
Using Python? Install with pip install perceptron or pip install openai
Supported image formats: JPEG, PNG, and WebP. Pass the image as a URL or a local file path. For deterministic outputs, add "temperature": 0.0 to the request body. See the API Reference for all parameters.
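
For readers who prefer Python without an SDK, the curl request above can be sketched with the standard library alone. This is a minimal sketch, not the official client: the endpoint, model ID, and message shape are copied from the curl example, while the helper names (to_data_url, build_payload, ask) are hypothetical. Sending a local file as a base64 data URL is an assumption borrowed from other OpenAI-style APIs; check the API Reference before relying on it.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Endpoint taken from the curl example above.
API_URL = "https://api.perceptron.inc/v1/chat/completions"

# MIME types for the supported formats (JPEG, PNG, WebP).
MIME_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
}


def to_data_url(path):
    """Encode a local image as a base64 data URL.

    Assumption: the endpoint accepts data URLs in the image_url field.
    """
    suffix = Path(path).suffix.lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"unsupported image format: {suffix!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{MIME_TYPES[suffix]};base64,{encoded}"


def build_payload(image, question, model="isaac-0.2-2b-preview", temperature=0.0):
    """Build the same JSON body as the curl example, plus a temperature field."""
    is_remote = image.startswith(("http://", "https://"))
    url = image if is_remote else to_data_url(image)
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
    }


def ask(payload, api_key):
    """POST the payload to the API and return the parsed JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call would look like ask(build_payload(image_url, "What is in this image?"), os.environ["PERCEPTRON_API_KEY"]). The response layout is not shown on this page, so inspect the returned dictionary rather than assuming a particular field path.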


Models

Model                        | Best for                        | Speed
isaac-0.2-2b-preview         | General use, reasoning enabled  | Fast
isaac-0.2-1b                 | Low-latency, edge deployment    | Fastest
isaac-0.1                    | Legacy support                  | Fast
qwen3-vl-235b-a22b-thinking  | Complex documents, long context | Slow

isaac-0.2-2b-preview

Best-in-class 2B VLM with reasoning. Sub-200ms time-to-first-token.

isaac-0.2-1b

Compact 1B VLM for edge and low-latency deployments.

isaac-0.1

Original 2B VLM, still supported for existing integrations.

Qwen3-VL

Hosted 235B model for complex documents and long context.
  • Model ID: qwen3-vl-235b-a22b-thinking
  • Context: 127K tokens
  • Reasoning: Yes (always on)
  • Pricing: $0.40/M input, $4.00/M output
  • Open weights on Hugging Face
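
The per-token prices above translate into a request cost with simple arithmetic. The sketch below uses only the two listed prices. The function name estimate_cost_usd is hypothetical, and it assumes billing is strictly linear in token counts, with no minimums or other fees.

```python
# Prices listed above for qwen3-vl-235b-a22b-thinking, in USD per million tokens.
INPUT_USD_PER_M = 0.40
OUTPUT_USD_PER_M = 4.00


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request, assuming purely linear per-token pricing."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Example: 100,000 input tokens and 5,000 output tokens.
print(f"${estimate_cost_usd(100_000, 5_000):.2f}")  # prints $0.06
```

Output tokens cost ten times as much as input tokens, so for a model whose reasoning is always on, the output count is likely to dominate the bill.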

Benchmarks

[Figure: isaac-0.2 benchmark comparison]