- LLM streaming in Node/Python (8m · 10 blocks)
- Function calling with retries (8m · 2 blocks)
- Vector DB ops from a backend POV (8m · 2 blocks)
- Cost guardrails per route (8m · 2 blocks)
- Caching LLM responses safely (8m · 2 blocks)
- OTel for LLMs: spans + metrics (8m · 2 blocks)
- Async background jobs for slow prompts (8m · 2 blocks)
- Multi-provider failover (8m · 2 blocks)
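As a taste of the last two items, here is a minimal sketch of multi-provider failover with per-provider retries. The provider functions, names, and signatures are hypothetical stand-ins, not any real vendor SDK; a real implementation would catch provider-specific error types and use that SDK's client calls.

```python
import time

# Hypothetical provider stubs standing in for real vendor SDK calls.
def call_primary(prompt: str) -> str:
    raise TimeoutError("primary provider down")  # simulated outage

def call_secondary(prompt: str) -> str:
    return f"secondary: {prompt}"

def complete_with_failover(prompt, providers, retries_per_provider=2, backoff=0.0):
    """Try each provider in order, retrying transient errors with
    exponential backoff, and return the first successful completion."""
    last_err = None
    for provider in providers:
        for attempt in range(retries_per_provider):
            try:
                return provider(prompt)
            except Exception as err:  # production code: catch narrower error types
                last_err = err
                time.sleep(backoff * (2 ** attempt))  # backoff grows per retry
    raise RuntimeError("all providers exhausted") from last_err

print(complete_with_failover("ping", [call_primary, call_secondary]))
```

The ordering of `providers` doubles as a priority list, so the same helper covers both "retry, then fail over" and plain failover (set `retries_per_provider=1`).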