We use Temporal.io to make long-running GenAI pipelines durable: retries, timeouts, state persistence, and resumability for video generation, multi-step agent workflows, and batch inference jobs. Without Temporal, a 10-step GenAI pipeline that fails at step 8 restarts from the beginning. With Temporal, it resumes from step 8. For pipelines that take 5–30 minutes, that means rerunning only the failed step instead of repeating the LLM calls and GPU work that already succeeded.
GenAI pipelines fail in ways that regular APIs don't: an LLM API rate limit (429) on the third call in a batch, a rendering timeout after 4 minutes of GPU work, a tool execution failure mid-agent-run, or an infrastructure hiccup during a 20-minute video generation. Standard retry logic handles simple failures. Complex multi-step pipelines need workflow-level durability, with state checkpointed at each step and resumable from any point.
Quantum Sketch video generation is the primary use case at Shahriar Labs: topic decomposition (LLM) → scene scripting (LLM) → mathematical validation → rendering (5+ minutes) → merging. Each step can fail independently. Temporal makes the pipeline resume from the exact failure point.
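To make the resume behavior concrete, here is a minimal sketch of a checkpointed pipeline. It is an illustration only: Temporal persists an event history and replays it rather than using a map like this, and every name here (Step, Checkpoints, runPipeline, demoResume, the step labels) is hypothetical rather than taken from the Quantum Sketch codebase.

```typescript
// Illustrative only: a hand-rolled checkpoint store that mimics the observable
// effect of resuming a pipeline. It is not how Temporal is implemented.
type Step = { name: string; run: (input: string) => string };

// Hypothetical in-memory checkpoint store: step name -> that step's output.
type Checkpoints = Map<string, string>;

function runPipeline(steps: Step[], input: string, done: Checkpoints): string {
  let value = input;
  for (const step of steps) {
    const saved = done.get(step.name);
    if (saved !== undefined) {
      value = saved; // completed in an earlier attempt: reuse the output
      continue;
    }
    value = step.run(value); // may throw; earlier checkpoints stay intact
    done.set(step.name, value);
  }
  return value;
}

// Demo: the "render" step fails once, then the pipeline is run again.
function demoResume(): { result: string; calls: Record<string, number> } {
  const calls: Record<string, number> = {};
  let renderFailsOnce = true; // simulate a single rendering timeout
  const names = ["decompose", "script", "validate", "render", "merge"];
  const steps: Step[] = names.map((name) => ({
    name,
    run: (input: string) => {
      calls[name] = (calls[name] ?? 0) + 1;
      if (name === "render" && renderFailsOnce) {
        renderFailsOnce = false;
        throw new Error("render timed out");
      }
      return `${input}>${name}`;
    },
  }));
  const done: Checkpoints = new Map();
  try {
    return { result: runPipeline(steps, "topic", done), calls };
  } catch {
    // The second attempt reuses the same checkpoints, so it resumes at "render".
    return { result: runPipeline(steps, "topic", done), calls };
  }
}
```

In the demo, the three steps before rendering run exactly once even though the pipeline is started twice; only the failed rendering step is executed again.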
Activity retries: Every LLM call is a Temporal Activity with a retry policy of maximumAttempts: 5, initialInterval: '2s', backoffCoefficient: 2, maximumInterval: '60s'. That allows up to four retries, waiting 2s, 4s, 8s, and 16s between attempts, so rate limits and transient failures are handled automatically.
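The arithmetic behind that policy can be sketched as a small standalone function. It assumes the usual exponential backoff rule (each wait is the initial interval multiplied by the coefficient once per prior retry, capped at the maximum interval). The RetryPolicySketch interface and its second-based fields are hypothetical stand-ins for illustration, not the SDK's own type.

```typescript
// Hypothetical, simplified mirror of the retry policy above, in seconds.
interface RetryPolicySketch {
  maximumAttempts: number;
  initialIntervalSec: number;
  backoffCoefficient: number;
  maximumIntervalSec: number;
}

// Returns the wait before each retry. N attempts allow at most N - 1 waits.
function retryDelaysSec(p: RetryPolicySketch): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < p.maximumAttempts - 1; retry++) {
    const raw = p.initialIntervalSec * Math.pow(p.backoffCoefficient, retry);
    delays.push(Math.min(raw, p.maximumIntervalSec)); // cap at the maximum
  }
  return delays;
}

const llmRetry: RetryPolicySketch = {
  maximumAttempts: 5,
  initialIntervalSec: 2,
  backoffCoefficient: 2,
  maximumIntervalSec: 60,
};

const delays = retryDelaysSec(llmRetry); // [2, 4, 8, 16]
const totalWaitSec = delays.reduce((sum, d) => sum + d, 0); // 30
```

With these values the 60s cap is never reached. It would only matter with more attempts, since the sixth wait would otherwise be 64s.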
Model fallback via activities: A single activity calls Claude Opus first, falls back to Claude Sonnet on failure, and then to a free model via openrouter-free. The workflow doesn't know which model was used; it just receives the output.
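A minimal sketch of such a fallback chain follows. The ModelOption type, generateWithFallback, and the stub model functions are all hypothetical. Real calls to the Anthropic or OpenRouter APIs would replace the stubs, and the function body would live inside the activity.

```typescript
// Hypothetical shape of one model client: takes a prompt, resolves to text.
type ModelCall = (prompt: string) => Promise<string>;

interface ModelOption {
  name: string;
  call: ModelCall;
}

// Tries each model in order and returns the first successful output.
// The caller only sees the text, not which model produced it.
async function generateWithFallback(
  prompt: string,
  models: ModelOption[],
): Promise<string> {
  const errors: string[] = [];
  for (const model of models) {
    try {
      return await model.call(prompt);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${model.name}: ${reason}`);
    }
  }
  // Every model failed: surface all reasons so a retry policy can act on it.
  throw new Error(`all models failed: ${errors.join("; ")}`);
}

// Stubs standing in for real API clients (assumptions for illustration).
const stubModels: ModelOption[] = [
  { name: "claude-opus", call: async () => { throw new Error("429 rate limited"); } },
  { name: "claude-sonnet", call: async (p) => `sonnet:${p}` },
  { name: "openrouter-free", call: async (p) => `free:${p}` },
];
```

With these stubs, generateWithFallback("hello", stubModels) resolves to the second model's output because the first one throws. If every model throws, the activity itself fails and the retry policy takes over.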
Long-running activities: For rendering jobs of 5+ minutes, configure a Temporal heartbeat timeout. The activity pings Temporal every 30s during rendering. If the pings stop for longer than the heartbeat timeout, Temporal marks the activity as failed and schedules a retry, which can run on a fresh worker.
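One way to structure the pinging is a small wrapper that fires a callback on a fixed interval while the long job runs. This is a sketch under assumptions: withHeartbeat and sleep are hypothetical helpers, and the beat callback is injected so the snippet runs without the SDK. In a Temporal TypeScript activity the callback would presumably be the heartbeat function from @temporalio/activity, with the activity's heartbeatTimeout set comfortably above the 30s interval.

```typescript
// Runs `work` while calling `beat` immediately and then every `intervalMs`.
// The timer is always cleared, whether the work succeeds or throws.
async function withHeartbeat<T>(
  work: () => Promise<T>,
  beat: () => void,
  intervalMs: number,
): Promise<T> {
  beat(); // report liveness right away
  const timer = setInterval(beat, intervalMs);
  try {
    return await work();
  } finally {
    clearInterval(timer); // stop pinging once the work is finished
  }
}

// Small helper used to simulate a slow rendering job in the demo.
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Demo with shortened timings: a 60 ms "render" that pings every 10 ms.
async function demoRender(): Promise<{ output: string; beats: number }> {
  let beats = 0;
  const output = await withHeartbeat(
    async () => {
      await sleep(60);
      return "video.mp4";
    },
    () => { beats += 1; },
    10,
  );
  return { output, beats };
}
```

For the real rendering job the interval would be 30_000 ms rather than 10. If the worker process dies mid-render, the callback stops firing, which is the signal that lets Temporal time the activity out.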
Temporal Cloud costs ~$25/month for moderate usage and eliminates operational overhead. Self-hosted Temporal on ECS adds ~2 hours of setup and requires managing Cassandra or PostgreSQL as the persistence store. For production GenAI pipelines that run hundreds of workflows daily, the Cloud cost is worth the operational simplicity. For the deployment infrastructure, see hermes-agent-aws and multi-tenant SaaS on AWS.
Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. Building LetX, QuantumSketch, and open-source AI agent skills.