Isaac 0.1 Token Counting
Isaac 0.1 uses a patch-based approach to process images:
- Native resolution: Processes images at their original resolution; supports a wide range of aspect ratios
- Patch Size: 16×16 pixels
- Pixel Shuffle: 2×2 (4 patches are pooled into a single token)
- Effective Token Size: 32×32 pixels per token
- Token Formula:
⌈width / 32⌉ × ⌈height / 32⌉
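As a minimal sketch, the token formula above can be written as a small Python function. The name `isaac_tokens` is hypothetical and is not part of any published SDK; the function applies only the formula and ignores the minimum and maximum limits.

```python
import math

TOKEN_SIDE = 32  # 16-pixel patches pooled 2x2 give 32x32 pixels per token


def isaac_tokens(width: int, height: int) -> int:
    """Token count from the formula ceil(width / 32) * ceil(height / 32)."""
    return math.ceil(width / TOKEN_SIDE) * math.ceil(height / TOKEN_SIDE)


print(isaac_tokens(640, 480))   # VGA
print(isaac_tokens(1280, 720))  # 720p: the height rounds up to 23 token rows
```

Images that fall outside the patch limits are resized first, so the formula should be applied to the resized dimensions in that case.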
Constraints
- Minimum: 256 patches → 64 tokens
- Maximum: 6,144 patches → 1,536 tokens
- Images outside these bounds are automatically resized while maintaining aspect ratio
- Due to the resize algorithm (dimensions must be divisible by 32), the practical maximum is typically around 1,508 tokens for common aspect ratios
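The bounds check described above could be sketched as follows. The helper names `patch_count` and `resize_status` are hypothetical, and the sketch assumes each side is rounded up to a multiple of 32 before patches are counted.

```python
import math

PATCH = 16          # patch size in pixels
MIN_PATCHES = 256   # 64 tokens
MAX_PATCHES = 6144  # 1,536 tokens


def patch_count(width: int, height: int) -> int:
    """Count 16x16 patches after rounding each side up to a multiple of 32."""
    w32 = math.ceil(width / 32) * 32
    h32 = math.ceil(height / 32) * 32
    return (w32 // PATCH) * (h32 // PATCH)


def resize_status(width: int, height: int) -> str:
    """Report whether an image would be resized automatically."""
    patches = patch_count(width, height)
    if patches < MIN_PATCHES:
        return "below minimum"
    if patches > MAX_PATCHES:
        return "above maximum"
    return "within limits"


print(patch_count(640, 480), resize_status(640, 480))
print(patch_count(1920, 1080), resize_status(1920, 1080))
```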
Calculation Examples
Example 1: 640×480 (VGA) - No Resize Needed
- Round dimensions to nearest multiple of 32: 640×480 (already divisible)
- Calculate patches: (640 ÷ 16) × (480 ÷ 16) = 40 × 30 = 1,200 patches
- Calculate tokens: 1,200 patches ÷ 4 = 300 tokens
- Check constraints: 256 ≤ 1,200 ≤ 6,144 ✓ (no resize needed)
- Cost: 300 × ($0.15 / 1,000,000) = $0.000045
Example 2: 1920×1080 (Full HD) - Resize Needed
- Calculate original patches: (1920 ÷ 16) × (1088 ÷ 16) = 120 × 68 = 8,160 patches (the height is rounded up to 1088, the next multiple of 32)
- Check constraints: 8,160 > 6,144 (exceeds maximum, resize needed)
- Resize to 1664×928 (maintains ~16:9 aspect ratio, divisible by 32)
- Calculate new patches: (1664 ÷ 16) × (928 ÷ 16) = 104 × 58 = 6,032 patches
- Calculate tokens: 6,032 patches ÷ 4 = 1,508 tokens
- Cost: 1,508 × ($0.15 / 1,000,000) = $0.000226
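The resize step in the 1920×1080 calculation can be reproduced with a short sketch. The exact resize algorithm is not specified here, so this is an assumption: scale both sides by the same factor until the pixel budget is met, then round each side down to a multiple of 32. The function name `isaac_downscale` is hypothetical. Under these assumptions the sketch arrives at the same 1664×928 result.

```python
import math

PATCH = 16
MAX_PATCHES = 6144
TOKEN_SIDE = 32


def isaac_downscale(width: int, height: int) -> tuple[int, int]:
    """Assumed resize for over-limit images: uniform scale, then floor to 32."""
    max_pixels = MAX_PATCHES * PATCH * PATCH        # 1,572,864 pixels
    beta = math.sqrt(width * height / max_pixels)   # shrink factor (> 1)
    new_w = math.floor(width / beta / TOKEN_SIDE) * TOKEN_SIDE
    new_h = math.floor(height / beta / TOKEN_SIDE) * TOKEN_SIDE
    return new_w, new_h


# Only meaningful for images that exceed the maximum patch count.
w, h = isaac_downscale(1920, 1080)
patches = (w // PATCH) * (h // PATCH)
tokens = patches // 4  # 2x2 pixel shuffle pools 4 patches into 1 token
print(w, h, patches, tokens)
```

Because every 16:9 input is scaled to the same pixel budget, 2K, 4K, and 8K images end up at the same dimensions in this sketch.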
Qwen3VL Token Counting
Qwen3VL uses a similar patch-based approach:
- Native resolution: Processes images at their original resolution; supports aspect ratios up to 200:1
- Patch Size: 16×16 pixels
- Spatial Merge Size: 2×2 (merges 2×2 patches into 1 token)
- Effective Token Size: 32×32 pixels per token
- Token Formula:
⌈width / 32⌉ × ⌈height / 32⌉
Constraints
- Default Token Limit: 2,560 tokens
- Images exceeding this limit are automatically resized while maintaining aspect ratio
- Due to the resize algorithm (dimensions must be divisible by 32), the practical maximum is typically around 2,479 tokens for common aspect ratios
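The following sketch estimates Qwen3VL token counts under the default limit. It rests on two assumptions that are not confirmed here: in-range images have each side rounded to the nearest multiple of 32 (using Python's `round`, which rounds halves to the even neighbor), and over-limit images are scaled uniformly and then rounded down to a multiple of 32. The function name `qwen3vl_tokens_estimate` is hypothetical. These assumptions were chosen because they reproduce the figures in the Qwen3VL table, including 880 tokens for 1280×720, where the ceiling formula alone would give 920.

```python
import math

FACTOR = 32        # effective token size in pixels
MAX_TOKENS = 2560  # default token limit


def qwen3vl_tokens_estimate(width: int, height: int,
                            max_tokens: int = MAX_TOKENS) -> int:
    """Estimated token count under the assumed round-then-floor resize."""
    max_pixels = max_tokens * FACTOR * FACTOR
    # Assumption: snap each side to the nearest multiple of 32.
    w = max(FACTOR, round(width / FACTOR) * FACTOR)
    h = max(FACTOR, round(height / FACTOR) * FACTOR)
    if w * h > max_pixels:
        # Assumption: uniform downscale, then floor to a multiple of 32.
        beta = math.sqrt(width * height / max_pixels)
        w = math.floor(width / beta / FACTOR) * FACTOR
        h = math.floor(height / beta / FACTOR) * FACTOR
    return (w // FACTOR) * (h // FACTOR)


for size in [(1280, 720), (1920, 1080), (2560, 1440)]:
    print(size, qwen3vl_tokens_estimate(*size))
```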
Common Image Sizes
Token counts and costs for common image resolutions.
Isaac 0.1
Pricing: $0.15 per million input tokens

| Resolution | Dimensions | Tokens | Cost (Input) | Per 1K Images |
|---|---|---|---|---|
| 512×512 | 512×512 | 256 | $0.000038 | $0.04 |
| VGA | 640×480 | 300 | $0.000045 | $0.05 |
| HD (720p) | 1280×720 | 920 | $0.000138 | $0.14 |
| 1024×1024 | 1024×1024 | 1,024 | $0.000154 | $0.15 |
| Full HD (1080p) | 1920×1080 | 1,508* | $0.000226 | $0.23 |
| 2K | 2560×1440 | 1,508* | $0.000226 | $0.23 |
| 4K | 3840×2160 | 1,508* | $0.000226 | $0.23 |
| 8K | 7680×4320 | 1,508* | $0.000226 | $0.23 |
*Isaac 0.1 automatically resizes images exceeding 6,144 patches to fit within this limit while maintaining aspect ratio. Due to the resize algorithm (dimensions must be divisible by 32), the practical maximum is 1,508 tokens (6,032 patches at 1664×928 for 16:9 aspect ratio).
Qwen3VL
Pricing: $0.70 per million input tokens

| Resolution | Dimensions | Tokens | Cost (Input) | Per 1K Images |
|---|---|---|---|---|
| 512×512 | 512×512 | 256 | $0.000179 | $0.18 |
| VGA | 640×480 | 300 | $0.000210 | $0.21 |
| HD (720p) | 1280×720 | 880 | $0.000616 | $0.62 |
| 1024×1024 | 1024×1024 | 1,024 | $0.000717 | $0.72 |
| Full HD (1080p) | 1920×1080 | 2,040 | $0.001428 | $1.43 |
| 2K | 2560×1440 | 2,479* | $0.001735 | $1.74 |
| 4K | 3840×2160 | 2,479* | $0.001735 | $1.74 |
| 8K | 7680×4320 | 2,479* | $0.001735 | $1.74 |
*Qwen3VL has a default token limit of 2,560 tokens (a budget of 2,621,440 pixels). Images exceeding this limit are automatically resized while maintaining aspect ratio. Due to the resize algorithm (dimensions must be divisible by 32), the practical maximum is 2,479 tokens (2,538,496 pixels at 2144×1184 for 16:9 aspect ratio).
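The cost columns in both tables follow from the token count and the per-million price. A minimal sketch of that arithmetic, with a hypothetical helper `image_cost` and the prices quoted above:

```python
# Input prices in USD per million tokens, as quoted in the pricing lines above.
PRICE_PER_MILLION = {"Isaac 0.1": 0.15, "Qwen3VL": 0.70}


def image_cost(tokens: int, model: str) -> float:
    """Input cost in USD for one image with the given token count."""
    return tokens * PRICE_PER_MILLION[model] / 1_000_000


# Capped token counts for large 16:9 images, taken from the tables.
for model, tokens in [("Isaac 0.1", 1508), ("Qwen3VL", 2479)]:
    per_image = image_cost(tokens, model)
    print(f"{model}: ${per_image:.6f} per image, "
          f"${per_image * 1000:.2f} per 1K images")
```

The per-1K figures are the per-image cost multiplied by 1,000 and rounded to cents, so small rounding differences against the per-image column are expected.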
Optimization Guidance
Recommended Resolutions
We recommend passing in the original resolution of the image. If the resolution is greater than the maximum supported, we recommend client-side preprocessing. Lower resolution can erode quality but may improve latency and reduce token counts.
Client-Side Preprocessing
You can resize images before sending them to reduce token usage and costs.
When to Resize:
- Below minimum: If your images are smaller than the minimum limits (256 patches for Isaac 0.1, 4 tokens for Qwen3VL), resize them yourself to avoid automatic upscaling
- Above maximum: If your images exceed the maximum limits (6,144 patches for Isaac 0.1, 2,560 tokens for Qwen3VL), resize them yourself to maintain control over quality
- Resize to multiples of 32: When resizing, aim for dimensions divisible by 32 (e.g., 1280×704, 1024×1024, 1920×1088) to avoid additional processing overhead
- Maintain aspect ratio: Preserve original proportions to avoid distortion
- Faster uploads: Pre-resized images reduce bandwidth usage
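One way to plan a client-side resize is to compute target dimensions before calling an image library. The sketch below is an illustration under stated assumptions, not the service's own algorithm: `plan_resize` is a hypothetical helper that scales uniformly to fit a token range, rounds down when shrinking, rounds up when enlarging, and otherwise snaps to the nearest multiple of 32.

```python
import math


def plan_resize(width: int, height: int, min_tokens: int, max_tokens: int,
                multiple: int = 32) -> tuple[int, int]:
    """Target size: multiples of `multiple`, aspect ratio roughly preserved."""
    px_per_token = multiple * multiple
    area = width * height
    if area > max_tokens * px_per_token:
        # Too large: shrink and round down so the budget is not exceeded.
        scale, snap = math.sqrt(max_tokens * px_per_token / area), math.floor
    elif area < min_tokens * px_per_token:
        # Too small: enlarge and round up so the minimum is met.
        scale, snap = math.sqrt(min_tokens * px_per_token / area), math.ceil
    else:
        # In range: only snap to the nearest multiple (re-check near limits).
        scale, snap = 1.0, round
    new_w = max(multiple, snap(width * scale / multiple) * multiple)
    new_h = max(multiple, snap(height * scale / multiple) * multiple)
    return new_w, new_h


# Isaac 0.1 limits expressed in tokens: 64 minimum, 1,536 maximum.
print(plan_resize(3840, 2160, 64, 1536))
print(plan_resize(1280, 720, 64, 1536))
print(plan_resize(120, 90, 64, 1536))
```

The returned dimensions can then be passed to the resize function of whichever image library you use. Passing a `max_tokens` value below the model limit trades image detail for lower cost.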