Distributed locks are easy to overtrust because they resemble local mutexes in API shape while behaving very differently under failure.
The important mental model is not “global synchronized.” It is “lease-based coordination under uncertain clocks, pauses, and network conditions.”
That changes what a safe design looks like.
When a Distributed Lock Is Actually Worth It
Reasonable use cases include:
- one scheduler job should run on only one node at a time
- a short critical section protects a shared external resource
- duplicate execution is costly enough that coordination is worth the complexity
Weak use cases include:
- using a global lock instead of partitioning work
- compensating for non-idempotent design where duplication could have been tolerated
- protecting long, uncertain workflows with one coarse lock
If partitioning or idempotency can remove the need for the lock, that is usually a better design.
Redis Locks Are Lease Patterns, Not Ownership Proof
A common Redis shape is:
- store a unique token with
SET NX PX - release only if the same token still owns the key
String token = UUID.randomUUID().toString();
boolean acquired = redis.set(lockKey, token, SetArgs.Builder.nx().px(10_000));
if (!acquired) {
return;
}
try {
doCriticalWork();
} finally {
redis.eval(RELEASE_SCRIPT, List.of(lockKey), List.of(token));
}
The critical detail is compare-and-delete on release. Blind DEL is unsafe because ownership may have changed after lease expiry.
Lease Expiry Is Normal, Not Exceptional
A safe design assumes:
- the holder may pause
- renewal may fail
- another node may acquire the lock after expiry
That means the application must define what happens when ownership is lost mid-work.
If the code simply continues because it “was the owner a moment ago,” the lock is not doing enough for a correctness-sensitive workflow.
Fencing Tokens Matter When Stale Owners Are Dangerous
For stronger correctness, a protected resource should be able to reject stale owners.
That is the role of a fencing token:
- each successful lease acquisition gets a monotonic token
- downstream writes accept only the newest valid token
record Lease(String ownerId, long fencingToken, Instant expiresAt) {}
void write(Lease lease, Command command) {
repository.applyIfTokenAtLeast(lease.fencingToken(), command);
}
This is often more important than the lock primitive itself. Without downstream fencing, a paused old owner may still perform stale work after a new owner has legitimately taken over.
ZooKeeper and Curator Fit Different Problems
Redis is attractive for simpler lease-based coordination. ZooKeeper-style coordination is a better fit when you need:
- stronger session semantics
- richer leader election behavior
- watch-based coordination patterns
The trade-off is more operational complexity.
So the question is not “which one is better?” It is “how much ownership accuracy and coordination depth does this problem need?”
Keep the Critical Section Small
Distributed locks are weakest when asked to guard:
- long-running workflows
- uncertain I/O
- many sequential external calls
The longer the section, the more likely lease expiry, renewal trouble, or stale-owner behavior becomes relevant.
That is why the safe default is:
- keep lock scope tiny
- renew only when necessary
- abort quickly on lost ownership
Warning
If the protected action cannot tolerate a stale owner continuing after lease loss, downstream fencing is usually required. The lock alone is not enough.
A Better Failure Drill
Pause the lock holder long enough for the lease to expire:
- node A acquires the lease
- node A pauses
- node B acquires the new lease
- node A resumes and attempts to continue work
If the resource cannot reject node A’s stale action, the coordination design is not strong enough for the use case.
That drill is much more revealing than simply proving two nodes do not usually enter at the same time.
Key Takeaways
- Distributed locks are lease-based coordination tools, not perfect remote mutexes.
- Lease expiry and stale owners should be treated as normal design conditions.
- Redis is fine for simple lease patterns, but stronger correctness often needs fencing.
- Keep lock scope small and prefer idempotency or partitioning when they can remove the lock entirely.
Categories
Tags