← /blog
2025.11.02·12 min·architecture

Why I stopped reaching for distributed transactions

Distributed transactions are one of those things that sound correct in a design review and fall apart in production. I've implemented two-phase commit, I've used saga patterns, and I've watched both cause more problems than they solved. Here's what I learned.

The appeal

The promise is seductive: update multiple services atomically, as if they were one database. Money moves from A to B, inventory decrements, an email is queued — all or nothing. Clean.

In practice, "all or nothing" becomes "all or nothing, and if the network hiccups during phase two, nothing, and now you have a locked resource that no one can touch until the coordinator times out."

Two-phase commit: the honest assessment

2PC works. It genuinely does — when your network is reliable, your services are fast, and nothing ever crashes. In other words, in conditions that don't exist in production.

-- coordinator
PREPARE TRANSACTION 'tx_123';
-- participant 1
PREPARE TRANSACTION 'tx_123';
-- participant 2
PREPARE TRANSACTION 'tx_123';
-- if all prepared:
COMMIT PREPARED 'tx_123';

The problem isn't the happy path. The problem is what happens when participant 2 takes 30 seconds to prepare because it's under load, and participant 1 is holding locks the entire time, and the coordinator is a single point of failure sitting in between.

What I do instead

Idempotency over atomicity. Design each step to be safely retryable. Store enough state that you can replay any operation without side effects. This is more work upfront but dramatically simpler to operate.

Eventual consistency with compensating actions. If step 3 of 5 fails, don't try to roll back steps 1 and 2 atomically. Instead, emit a compensating event that a downstream consumer handles asynchronously. It's not instant, but it's reliable.

Outbox pattern. Write your state change and the event you want to publish in the same database transaction. A separate process polls the outbox table and publishes events. No distributed lock needed.

-- single local transaction
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 'a';
INSERT INTO outbox (topic, key, payload) VALUES
  ('payment.processed', 'a', '{"amount": 100}');
COMMIT;

-- separate poller picks up outbox rows and publishes

When distributed transactions are actually fine

If you have two postgres databases and you need cross-database consistency, postgres_fdw with COMMIT PREPARED works fine for low-throughput workloads. If you're moving money between accounts in the same database, a plain BEGIN/COMMIT is all you need — no distributed anything required.

The rule of thumb: if you can fit your consistency boundary in one database, do it. If you can't, use the outbox pattern and accept eventual consistency. Distributed transactions are a last resort, not a default.

The best distributed transaction is the one you don't need.

← all poststhanks for reading