Overview

Focus gives Isaac a “zoom-and-recall” tool call. When enabled, the model can crop into regions of the image that look relevant to the prompt, call itself on those zooms, and fuse the extra context back into the final answer. In effect, it allocates more compute to the most relevant patches, which helps with tiny details, dense scenes, and hard-to-spot objects.

When to use

  • You need pixel-level scrutiny (serial numbers, fine print, tiny defects).
  • Scenes are busy and you want the model to search exhaustively without hand-crafted prompts.
  • You need multi-object descriptions where localized views improve recall (e.g., shelf audits, parts lists).

How it works

  • Focus is modeled as a tool call: supported models (isaac-0.2-2b-preview and isaac-0.2-1b) can choose whether to invoke it.
  • The model selects regions, re-runs itself on cropped views, and combines those findings with the original context.
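
The loop described above can be pictured with a toy sketch. This is not Isaac's implementation and does not use the Perceptron SDK: every name here (crop, quadrants, describe, focus_pass) is hypothetical, fixed quadrants stand in for the regions the model would choose itself, and the stand-in "model" simply misses any detail that covers less than 5% of the view it is given.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
Image = List[List[int]]          # toy image: rows of integer "object ids", 0 = background


def crop(img: Image, box: Box) -> Image:
    """Return the sub-image covered by box."""
    left, top, right, bottom = box
    return [row[left:right] for row in img[top:bottom]]


def quadrants(img: Image) -> List[Box]:
    """Hypothetical region proposer: split the image into four equal views."""
    h, w = len(img), len(img[0])
    mh, mw = h // 2, w // 2
    return [(0, 0, mw, mh), (mw, 0, w, mh), (0, mh, mw, h), (mw, mh, w, h)]


def describe(view: Image) -> List[int]:
    """Stand-in model: reports an object only if it fills >= 5% of the view."""
    pixels = [p for row in view for p in row]
    counts = {}
    for p in pixels:
        if p != 0:
            counts[p] = counts.get(p, 0) + 1
    return sorted(obj for obj, n in counts.items() if n / len(pixels) >= 0.05)


def focus_pass(
    img: Image,
    propose: Callable[[Image], List[Box]],
    model: Callable[[Image], List[int]],
) -> List[int]:
    """Run the model on the full image, then on each zoomed view, and fuse the findings."""
    findings = [model(img)]                     # first pass: full image
    for box in propose(img):
        findings.append(model(crop(img, box)))  # re-run on each cropped view
    fused: List[int] = []
    for found in findings:                      # fuse: de-duplicate, keep first-seen order
        for obj in found:
            if obj not in fused:
                fused.append(obj)
    return fused


# An 8x8 scene with one large object (id 3, a 2x2 block) and one tiny object (id 7, one pixel).
scene: Image = [[0] * 8 for _ in range(8)]
scene[1][6] = 7
for r in (4, 5):
    for c in (0, 1):
        scene[r][c] = 3

single_pass = describe(scene)                     # the tiny object falls below the 5% threshold
with_focus = focus_pass(scene, quadrants, describe)
print("single pass:", single_pass)
print("with focus: ", with_focus)
```

In this toy, the one-pixel object is invisible in the full view but clears the threshold once its quadrant is examined on its own, which mirrors why zoomed views help with small details.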

Usage (Python SDK)

import perceptron
from perceptron import perceive, image, text

perceptron.configure(provider="perceptron", api_key="<your_api_key_here>")

# Let Isaac zoom into the photo to recover fine details.
@perceive(model="isaac-0.2-2b-preview", focus=True)
def describe_tools(photo):
  return image(photo) + text("List every handheld tool you see with colors.")

result = describe_tools("https://example.com/garage.png")
print("answer:", result.text)

Tips

  • Keep prompts concise; Focus decides where to zoom and how many views to take.
  • Use a low temperature for consistent zoom choices when automating QA checks.
  • Combine with structured expects (expects="box" or expects="point") if you need precise localization alongside detailed text.