This service enforces multiple layers of rate limiting to keep the API stable while still allowing legitimate bursts of traffic.
Global Limits
- GET endpoints: 300 requests per minute per application.
- Write/Delete endpoints (POST, PATCH & DELETE): 300 requests per minute per application. This generic ceiling applies after the endpoint-specific limits listed below, ensuring any new write route has a default guardrail.
- 429 responses include the following headers so clients can self-throttle:
  - `X-RateLimit-Limit`
  - `X-RateLimit-Remaining`
  - `X-RateLimit-Reset` (epoch seconds)
  - `Retry-After`
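
As a rough illustration of how a client might use these headers, the sketch below works out how long to wait after a 429. It assumes `Retry-After` carries a number of seconds (this page does not specify its format) and falls back to `X-RateLimit-Reset`, which is documented as epoch seconds. The function name `seconds_until_retry` is hypothetical, and a plain mapping is used here, so header names are matched case-sensitively.

```python
import time
from typing import Mapping, Optional


def seconds_until_retry(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Estimate how many seconds to wait before retrying a 429 response.

    Assumption: Retry-After is a number of seconds. If it cannot be parsed
    as a number, fall back to X-RateLimit-Reset (epoch seconds).
    """
    now = time.time() if now is None else now

    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number; try the reset timestamp instead

    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            return max(0.0, float(reset) - now)
        except ValueError:
            pass

    # No usable header: the caller should apply its own backoff policy.
    return 0.0
```

Most HTTP libraries expose headers through a case-insensitive mapping, which can be passed in directly.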
Need something higher? Reach out to support with your use case and expected burst size so we can review an override.
Endpoint-Specific Limits
The middleware defines additional, more restrictive scopes for high-impact operations:
| Scope | Method and Path | Limit | Description |
|---|---|---|---|
| `session-v2-create:free` | `POST /v2/session/` | 5 rpm | Free (zero-credit) workflows share a tighter quota to protect against abuse. |
| `session-v2-create:paid` | `POST /v2/session/` | 600 rpm | Paid workflows can burst higher for production integrations. |
| `session-decision` | `GET /v2/session/<id>/decision/`, `GET /session/<id>/decision/` | 100 rpm | Session decision retrieval is throttled to prevent excessive polling. |
| `session-generate-pdf` | `GET /session/<id>/generate-pdf/` | 100 rpm | PDF generation is limited because rendering is CPU-bound. |
Client Guidance
- Watch the rate-limit headers and begin throttling when `X-RateLimit-Remaining` drops under 15% of `X-RateLimit-Limit`.
- Implement exponential backoff for 429s (e.g., 5s → 10s → 20s → 40s).
- Log or alert when retries are triggered so your team can investigate sustained bursts.
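
The guidance above could be sketched on the client side roughly as follows. This is a minimal illustration, not a reference client: `should_throttle`, `backoff_delays`, and `call_with_backoff` are hypothetical names, and `send` stands in for whatever function issues the request and returns a `(status, headers)` pair. The 15% threshold and the 5s → 40s schedule are taken from the bullets above.

```python
import logging
import time
from typing import Callable, List, Mapping, Tuple

logger = logging.getLogger("rate_limit_client")

Response = Tuple[int, Mapping[str, str]]


def should_throttle(headers: Mapping[str, str], threshold: float = 0.15) -> bool:
    """True when X-RateLimit-Remaining is under `threshold` of X-RateLimit-Limit."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        return False  # headers missing or malformed: nothing to act on
    return limit > 0 and remaining < threshold * limit


def backoff_delays(base: float = 5.0, factor: float = 2.0, attempts: int = 4) -> List[float]:
    """Exponential schedule; the defaults give 5s, 10s, 20s, 40s."""
    return [base * factor ** i for i in range(attempts)]


def call_with_backoff(
    send: Callable[[], Response],
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, retrying on 429 with exponential backoff.

    Every retry is logged so sustained bursts can be investigated.
    """
    for attempt, delay in enumerate(backoff_delays(), start=1):
        status, headers = send()
        if status != 429:
            return status, headers
        logger.warning("429 received; retry %d in %.0fs", attempt, delay)
        sleep(delay)
    # Final attempt after the last delay; the caller handles a remaining 429.
    return send()
```

Passing `sleep` as a parameter keeps the retry loop easy to test without real delays. A client might also check `should_throttle` on every successful response and slow its request rate before a 429 occurs.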
