In our first simple version of a lock, we’ll take note of a few different potential failure
scenarios. When we actually start building the lock, we won’t handle all of the failures
right away. We’ll instead try to get the basic acquire, operate, and release process
working correctly. After we have that working and have demonstrated how using locks can
actually improve performance, we’ll address the remaining failure scenarios.
Sometimes a client fails to release a lock that it holds, for one reason or
another. To protect against clients that crash and leave a lock in the
acquired state, we’ll eventually add a timeout, which causes the lock to be released
automatically if the process that holds the lock doesn’t finish within the given time.
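To make the timeout idea concrete, here is a minimal sketch that does not talk to Redis at all. It assumes an in-memory table standing in for the server, and the names ExpiringLockTable, acquire, release, and the injected clock are hypothetical, chosen only for illustration. In Redis itself the same effect would come from key expiration rather than from code like this.

```python
import time
import uuid


class ExpiringLockTable:
    """In-memory stand-in for a store whose lock entries expire (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the expiry behavior can be shown without sleeping.
        self._clock = clock
        self._locks = {}  # lock name -> (owner token, expiry time)

    def acquire(self, name, timeout):
        """Return an owner token if the lock is free or has expired, else None."""
        now = self._clock()
        held = self._locks.get(name)
        if held is not None and held[1] > now:
            return None  # another holder still owns an unexpired lock
        token = str(uuid.uuid4())
        self._locks[name] = (token, now + timeout)
        return token

    def release(self, name, token):
        """Release only if the caller still owns an unexpired lock."""
        held = self._locks.get(name)
        if held is not None and held[0] == token and held[1] > self._clock():
            del self._locks[name]
            return True
        return False
```

If a holder crashes and never calls release, the next acquire after the timeout simply succeeds. The ownership check in release also keeps a slow holder whose lock already timed out from releasing a lock that someone else has since acquired.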
Many users of Redis already know about locks, locking, and lock timeouts. But
sadly, many implementations of locks in Redis are only mostly correct. The problem
with mostly correct locks is that they’ll fail in ways that we don’t expect, precisely when
we don’t expect them to fail. Here are some situations that can lead to incorrect
behavior, and in what ways the behavior is incorrect:
Even if each of these problems had only a one-in-a-million chance of occurring,
Redis can perform 100,000 operations per second on recent hardware (and up
to 225,000 operations per second on high-end hardware), so those problems can come
up under heavy load. That makes it important to get locking right.
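The arithmetic behind that claim can be sketched as follows. This is a rough back-of-the-envelope calculation that assumes every operation is independently exposed to the one-in-a-million failure, which overstates the risk for workloads where only some operations involve locks. The function names are hypothetical.

```python
FAILURE_ODDS = 1_000_000  # assumed: one failure per million operations


def mean_seconds_between_failures(ops_per_second):
    """Average time between failures at a sustained operation rate."""
    return FAILURE_ODDS / ops_per_second


def expected_failures(ops_per_second, seconds):
    """Expected number of failures over a period at a sustained rate."""
    return ops_per_second * seconds / FAILURE_ODDS


for rate in (100_000, 225_000):
    gap = mean_seconds_between_failures(rate)
    per_day = expected_failures(rate, 86_400)
    print(f"{rate} ops/s: one failure about every {gap:.1f} s, {per_day:.0f} per day")
```

Under these assumptions, a server sustaining 100,000 operations per second would see a failure roughly every 10 seconds, so a one-in-a-million event is routine rather than rare.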