Skip to content

Performance Optimization

Get the most out of our APIs with these tips.

Batch Requests

Cut latency by sending multiple prompts at once:

data = {"prompts": ["Tell me a story.", "Write a poem."]}
response = requests.post(url, json=data, headers=headers)

Caching

Save time and credits by caching responses:

  • Use Redis for fast, in-memory storage in high-traffic apps.