Durable Outboxes
I have been fortunate enough to be extensively working with Cloudflare Workers as part of my day job as of late. Durable Objects in particular have become an indespensible part of my toolbox. For the uninitiated, Durable Objects extend the serverless compute primitive (think Lambda) to include an SQLite database to enable state persistence. Each Durable Object is closer to a live object instance than a memory-mapped class; state changes have to be persisted explicitly. But they do allow you to build complex, stateful, serverless applications while mostly forgetting about devops.
The Canonical Example
Kenton Varda, introducing the SQLite storage backend for Durable Objects in late 2024, detailed how an airline could manage seat assignment for a flight:
;
// All Workers that specify the same object ID (probably based on the flight
// number and date) will reach the same instance of FlightSeating.
Each flight has its own object, and therefore database, instantiated on-demand. Once the hypthothetical plane takes to the skies, the object can safely be deleted, ideally serialized to some long term persistant store. Inside that object, the rest reads like ordinary SQL:
// Assign passenger to a seat.
seatId, occupant
Look at the comment in assignSeat. There are no awaits between the read and the write, and the database is private to this Durable Object, so nothing else can grab the seat between checks. The race that elsewhere demands SELECT FOR UPDATE or optimistic locking isn't there.
A Quick Performance Detour
SQLite inside a Durable Object runs in-process. this.sql.exec() is a function call, not a network round trip; per-query overhead is in the microseconds.
This is what makes the N+1 pattern affordable. With a database 5ms away over the wire, fetching 100 parent rows and then their children one at a time costs 505ms. In a Durable Object, the same loop is bounded by the queries themselves, performing on par with an equivalent JOIN.
Playing Nice With Others
It is an inevitability that when developing any web app you will have to incorporate another service's API. Oftentimes you will find yourself at the mercy of an outage occurring at said service provider. Naive fetches go unanswered, throwing errors that look like your fault, or worse, allowing your database and the external API to quietly fall out of sync. This is especially important for state-changing APIs like payments, SMS, and transactional email.
Take Azure Communication Services. You hand it your email and, rather than a clean send-and-confirm, you get back an operation handle:
;
// 202 Accepted, with the verdict to be delivered separately.
;
The email hasn't actually gone out yet. To find out how it ended, you come back later and poll:
;
;
// "Running"... try again in a moment.
// "Succeeded"... finally.
// "Failed"... now what?
Each of these calls is its own opportunity to fail. The submit might 429. The poll might 5xx. The operation itself can sit in Running for an embarrassingly long time. By the time you know how it ended, the user has long since closed the tab; your application has to keep track for them.
Sound The Alarm
The outbox is a Durable Object with a SQLite queue and an alarm. Each row is a pending or in-flight item:
(
id INTEGER PRIMARY KEY AUTOINCREMENT,
payload TEXT NOT NULL,
status TEXT NOT NULL DEFAULT 'queued', -- 'queued' | 'sending' | 'sent' | 'failed'
retries INTEGER NOT NULL DEFAULT 0,
operationUrl TEXT,
error TEXT
);
Two entry points. enqueue adds rows and arms the alarm to fire immediately. alarm drains the queue and schedules itself again if there's still work.
async payload: object: Promise<void>
async : Promise<void>
The queue row is a small state machine, moving from 'queued' to 'sending' to 'sent' or 'failed' as the passes advance it. Errors don't drop a row from the queue; they bump the retry counter and leave it for the next tick.
private async : Promise<void> catch }
}
private async : Promise<void> else if result.status === "Failed" }
}
The alarm doubles as the rate limit. The same setAlarm(now + 10_000) that schedules the next tick also prevents the outbox from hitting the external API again for ten seconds. There is no separate token bucket or sleep loop.
At fifteen sends and twenty polls per ten-second tick, the outbox processes ninety items per minute. Tune for your external API's rate limit.
My first instinct was Queues. It does not fit here. Queues processes each event once. The outbox processes the same row across many ticks.
Enter: WebSockets
The user does not need to poll your application to see progress. A WebSocket can push updates as they happen.
Cloudflare provides hibernatable WebSockets for this. Calling ctx.acceptWebSocket() rather than socket.accept() hands the connection back to the platform; the Durable Object can be evicted while the socket stays connected, and only wakes when there is something to send. The connection survives idle time at no compute cost.
async request: Request: Promise<Response>
I chose Turbo Stream fragments rather than JSON. Turbo will not be everyone's pick. Sending rendered HTML and letting the browser apply it means there is no client-side state to manage. The server broadcasts what should change, and it changes.
private html: string: void
Every mutation that touches queue state can also call broadcast with the appropriate fragment. The Durable Object wakes when the alarm fires, runs, broadcasts, and is evicted again. Connected browsers update without ever sending a request.