Scale performance and cost
Optimize Isaac for your SLA, whether you need sub-100ms latency or high throughput.
Rate limits
| Endpoint | Limit |
|---|---|
| Chat completions | 300 requests/min |
| Models | 30 requests/min |
| Media upload URL | 150 requests/min |
| Media download URL | 150 requests/min |
Need higher limits? Contact [email protected].
Low latency pipelines
Use tight timeouts and config() scopes so interactive paths fail fast, stream minimal tokens, and never block your UI.
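As a rough illustration of the fail-fast idea, the sketch below wraps an awaitable in a deadline using Python's standard asyncio.wait_for. It does not use the SDK's config() scopes; with_deadline, slow_inference, and fast_inference are hypothetical names invented for this example, and the two inference coroutines are stand-ins for real model calls.

```python
import asyncio


async def with_deadline(coro, timeout_s, fallback=None):
    """Await `coro`, but give up after `timeout_s` seconds and return `fallback`."""
    try:
        # wait_for cancels the underlying coroutine when the deadline passes.
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return fallback


async def slow_inference():
    # Hypothetical stand-in for a model call that overruns the latency budget.
    await asyncio.sleep(1.0)
    return "full answer"


async def fast_inference():
    # Hypothetical stand-in for a model call that finishes within budget.
    await asyncio.sleep(0.01)
    return "quick answer"
```

An interactive path might call asyncio.run(with_deadline(slow_inference(), 0.05, "timed out")) and show the fallback instead of leaving the UI waiting.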
Parallel inference lanes
For bulk jobs, fan out API calls with a small worker pool and a shared runner so each task just swaps in the frame path.
Throughput guardrails
Handle RateLimitError (429) by backing off and retrying. Use the Retry-After response header to determine when to retry; it returns a delay in seconds (e.g., Retry-After: 120). Add jitter to avoid a thundering herd. Keep max_tokens tight so requests finish quickly. Resize images client-side to stay under the 20 MB request limit.
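One possible shape for these guardrails, combined with the small worker pool described under Parallel inference lanes, is sketched below. Everything here is an assumption made for illustration: the RateLimitError class is a local stand-in for whatever your client raises on a 429, its retry_after attribute is assumed to carry the parsed Retry-After value in seconds, and backoff_delay, call_with_retry, and run_batch are hypothetical helper names.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimitError(Exception):
    """Local stand-in for a 429 error; retry_after mirrors the Retry-After header."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited (429)")
        self.retry_after = retry_after


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Honor the server's hint, plus up to 1 s of jitter to spread out clients.
        return retry_after + random.uniform(0, 1.0)
    # No hint available: exponential backoff with full jitter, capped at `cap`.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retry(fn, *args, max_retries=5, sleep=time.sleep):
    """Call fn(*args), retrying on RateLimitError up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn(*args)
        except RateLimitError as err:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller.
            sleep(backoff_delay(attempt, err.retry_after))


def run_batch(fn, items, workers=4):
    """Fan items out over a small thread pool; each task reuses the same runner."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda item: call_with_retry(fn, item), items))
```

Here fn would be your shared runner and items the list of frame paths. Keeping workers small helps a batch stay under the per-minute limits in the table above.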