How It Works

Compare-and-Swap (CAS)

osqueue uses CAS semantics for all writes to the queue state file. Every object storage backend provides a version token:

S3: ETags (MD5 hash of the object)
GCS: Generation numbers (monotonically increasing integers)
Memory: Incremental version counter

The write flow:

Read queue.json → get data + version token
Apply mutations to the state in memory
Write the modified state back, passing the expected version
If the version matches, the write succeeds and returns a new version
If another writer changed the file, the write fails with CASConflictError
On conflict, re-read the file and retry

This is optimistic concurrency: writes succeed without locking as long as there's no contention. When there is contention, the retry loop ensures eventual consistency.

Group Commit Engine

The GroupCommitEngine is the core of the broker. It batches multiple mutations into a single CAS write to reduce storage API calls.

Time ──────────────────────────────────────────────────▶

  submitJob("email:send")  ──┐
  claimJob(worker1)         ──┼── Batch ──▶ CAS Write ──▶ Resolve all
  heartbeat(job1, worker2)  ──┘
                                           50ms interval
  completeJob(job2, worker1) ──┐
  submitJob("report")        ──┼── Batch ──▶ CAS Write ──▶ Resolve all
                               ┘

How Batching Works

Callers submit mutations via engine.submit(mutation) which returns a Promise
Mutations accumulate in a buffer
When a mutation arrives to an empty buffer, the write loop runs immediately (zero delay). When the buffer is empty, it polls every intervalMs (default: 50ms)
All buffered mutations are applied sequentially to the cached state (state transitions are immutable — each returns a new object)
Heartbeat expiry runs on every write pass (cleaning up stale jobs)
The modified state is written via CAS
On success, all Promises in the batch resolve
On CAS conflict, the engine re-reads state and retries (up to 5 times with backoff)

Configuration

Option	Default	Description
`intervalMs`	`50`	Write loop interval in milliseconds
`heartbeatTimeoutMs`	`30000`	Time before an in-progress job is considered stale
`conflictBackoffMs`	`50`	Base backoff after CAS conflict (multiplied by `attempt + 1`, so 50ms, 100ms, 150ms...)
`maxRetries`	`5`	Maximum CAS retry attempts per batch

Throughput vs Latency

Low interval (e.g., 50ms): Lower latency per mutation, more storage API calls
High interval (e.g., 2000ms): Higher latency, but far fewer API calls (important for cost)

For production with S3, a 2-second interval batches many mutations per write, keeping costs under $1/month. See Configuration for tuning.

Compare-and-Swap (CAS)​

Group Commit Engine​

How Batching Works​

Configuration​

Throughput vs Latency​

Compare-and-Swap (CAS)

Group Commit Engine

How Batching Works

Configuration

Throughput vs Latency