Rate Limiting

This service enforces multiple layers of rate limiting to keep the API stable while still allowing legitimate bursts of traffic.

Global Limits

  • GET endpoints: 300 requests per minute per application.
  • Write and delete endpoints (POST, PATCH, and DELETE): 300 requests per minute per application. This generic ceiling is evaluated after the endpoint-specific limits listed below, so any new write route has a default guardrail.
  • 429 responses include the following headers so clients can self-throttle:
    • X-RateLimit-Limit
    • X-RateLimit-Remaining
    • X-RateLimit-Reset (epoch seconds)
    • Retry-After
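
As a minimal sketch of how a client might turn these headers into a wait time, the helper below prefers Retry-After and falls back to X-RateLimit-Reset. The function name seconds_to_wait is hypothetical, and the sketch assumes Retry-After carries a number of seconds; this page does not state its format.

```python
from typing import Mapping, Optional


def seconds_to_wait(headers: Mapping[str, str], now: float) -> Optional[float]:
    """Return how long to pause before retrying, or None if no hint is present.

    `headers` is looked up with the exact names listed above; most HTTP
    clients expose a case-insensitive mapping, a plain dict does not.
    """
    # Assumption: Retry-After is a whole number of seconds. The HTTP-date
    # form allowed by the HTTP specification is not handled in this sketch.
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)

    # Fall back to X-RateLimit-Reset, documented above as epoch seconds.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None:
        try:
            return max(0.0, float(reset) - now)
        except ValueError:
            return None
    return None
```

Passing the current time in as `now` (for example `time.time()`) keeps the helper easy to test.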

Need a higher limit? Contact support with your use case and expected burst size so we can review an override.

Endpoint-Specific Limits

The middleware defines additional, more restrictive scopes for high-impact operations:

Scope                  | HTTP Method(s)                                                | Limit   | Description
-----------------------|---------------------------------------------------------------|---------|------------
session-v2-create:free | POST /v2/session/                                             | 5 rpm   | Free (zero-credit) workflows share a tighter quota to protect against abuse.
session-v2-create:paid | POST /v2/session/                                             | 600 rpm | Paid workflows can burst higher for production integrations.
session-decision       | GET /v2/session/<id>/decision/, /session/<id>/decision/       | 100 rpm | Session decision retrieval is throttled to prevent excessive polling.
session-generate-pdf   | GET /session/<id>/generate-pdf/                               | 100 rpm | PDF generation is limited due to CPU-bound rendering costs.
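
A client can also pace itself so that it rarely reaches these limits. The sketch below is one possible approach, assuming a rolling 60-second window; the server's actual windowing algorithm is not described here, and the names SCOPE_LIMITS_RPM and ScopePacer are hypothetical.

```python
from collections import deque
from typing import Deque, Dict

# Limits copied from the scopes above, in requests per minute.
SCOPE_LIMITS_RPM: Dict[str, int] = {
    "session-v2-create:free": 5,
    "session-v2-create:paid": 600,
    "session-decision": 100,
    "session-generate-pdf": 100,
}


class ScopePacer:
    """Tracks request timestamps for one scope over a rolling window."""

    def __init__(self, limit_rpm: int, window_seconds: float = 60.0) -> None:
        self.limit = limit_rpm
        self.window = window_seconds
        self.sent: Deque[float] = deque()

    def delay_before_next(self, now: float) -> float:
        """Seconds to wait before another request fits inside the window."""
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # The oldest request must leave the window before the next one fits.
        return self.window - (now - self.sent[0])

    def record(self, now: float) -> None:
        """Remember that a request was sent at time `now`."""
        self.sent.append(now)
```

In use, a caller would check `delay_before_next(time.monotonic())`, sleep for that long if it is non-zero, send the request, and then call `record(...)`.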

Client Guidance

  • Watch the rate-limit headers and begin throttling when X-RateLimit-Remaining drops under 15% of X-RateLimit-Limit.
  • Implement exponential backoff for 429s (e.g., 5s → 10s → 20s → 40s).
  • Log or alert when retries are triggered so your team can investigate sustained bursts.
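
The guidance above can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference client: `send` is a hypothetical callable that performs one request and returns a (status, body) pair, and the function names are invented for this example. The delays follow the 5s → 10s → 20s → 40s schedule.

```python
import logging
import time
from typing import Callable, List, Tuple

logger = logging.getLogger("rate_limit_client")


def should_throttle(remaining: int, limit: int, threshold: float = 0.15) -> bool:
    """True once X-RateLimit-Remaining drops under 15% of X-RateLimit-Limit."""
    return limit > 0 and remaining < threshold * limit


def backoff_delays(base: float = 5.0, factor: float = 2.0, retries: int = 4) -> List[float]:
    """Exponential schedule: base, base*factor, base*factor^2, ..."""
    return [base * factor ** attempt for attempt in range(retries)]


def call_with_backoff(
    send: Callable[[], Tuple[int, object]],
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, object]:
    """Call send(); on a 429, wait and retry following the backoff schedule."""
    status, body = send()
    for attempt, delay in enumerate(backoff_delays(), start=1):
        if status != 429:
            break
        # Log every retry so sustained bursts are visible to the team.
        logger.warning("429 received; retry %d in %.0fs", attempt, delay)
        sleep(delay)
        status, body = send()
    return status, body
```

Injecting `sleep` keeps the retry loop testable without real waiting. If the last retry still returns 429, the function hands that response back to the caller instead of raising.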