6.2.2 Simple locks
In our first simple version of a lock, we’ll take note of a few different potential failure
scenarios. When we actually start building the lock, we won’t handle all of the failures
right away. We’ll instead try to get the basic acquire, operate, and release process
working right. After we have that working and have demonstrated how using locks can
actually improve performance, we’ll address any failure scenarios that we haven’t
already addressed.
While using a lock, sometimes clients can fail to release a lock for one reason or
another. To protect against failure where our clients may crash and leave a lock in the
acquired state, we’ll eventually add a timeout, which causes the lock to be released
automatically if the process that has the lock doesn’t finish within the given time.
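As a rough illustration of the timeout idea, the following sketch models a lock that frees itself once its holder has exceeded a time limit. It is an in-memory stand-in, not the Redis-backed lock this section goes on to build: the `TimedLock` class, its `acquire` method, and the injected fake clock are all hypothetical names chosen for the example. Against a real Redis server, one way to get similar behavior is a single `SET key value NX EX seconds` command, which only sets the key if it is absent and attaches an expiry to it.

```python
import time


class TimedLock:
    """In-memory stand-in for a lock with a timeout (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._holder = None      # identifier of the current holder, if any
        self._expires_at = 0.0   # clock time after which the lock is free again

    def acquire(self, holder, timeout):
        """Take the lock if it is free or its previous holder timed out."""
        now = self._clock()
        if self._holder is not None and now < self._expires_at:
            return False         # still held and not yet expired
        self._holder = holder
        self._expires_at = now + timeout
        return True


# A fake clock lets us step through time without sleeping.
fake_now = [0.0]
lock = TimedLock(clock=lambda: fake_now[0])

first = lock.acquire("client-a", timeout=10)   # client-a takes the lock, then "crashes"
fake_now[0] = 5.0
second = lock.acquire("client-b", timeout=10)  # too early: the lock is still held
fake_now[0] = 11.0
third = lock.acquire("client-b", timeout=10)   # timeout has passed: the lock is free
```

Because client-a never releases the lock, only the timeout lets client-b make progress. The price of that safety net is that a slow but healthy holder can also lose the lock without noticing.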
Many users of Redis already know about locks, locking, and lock timeouts. But
sadly, many implementations of locks in Redis are only mostly correct. The problem
with mostly correct locks is that they’ll fail in ways that we don’t expect, precisely when
we don’t expect them to fail. Here are some situations that can lead to incorrect
behavior, and in what ways the behavior is incorrect:
- A process acquired a lock, operated on data, but took too long, and the lock was automatically released. The process doesn’t know that it lost the lock, or may even release the lock that some other process has since acquired.
- A process acquired a lock for an operation that takes a long time and crashed. Other processes that want the lock don’t know what process had the lock, so they can’t detect that the process failed, and waste time waiting for the lock to be released.
- One process had a lock, but it timed out. Other processes try to acquire the lock simultaneously, and multiple processes are able to get the lock.
- Because of a combination of the first and third scenarios, many processes now hold the lock and all believe that they are the only holders.
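The first scenario in the list can be replayed with a small simulation. The sketch below assumes a dict-backed stand-in for the server (`FakeStore`) and two hypothetical release helpers; none of these names come from the text or from a Redis client library. It contrasts a release that deletes the lock key unconditionally with one that first checks a per-holder token.

```python
import uuid


class FakeStore:
    """Dict-backed stand-in for a key-value server (illustrative only)."""

    def __init__(self):
        self.data = {}

    def set_if_absent(self, key, value):
        """Store the value only when the key does not exist yet."""
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def expire_now(self, key):
        """Simulate the server dropping the key when its timeout fires."""
        self.data.pop(key, None)


def release_blindly(store, key):
    """Unsafe: deletes the lock no matter who currently holds it."""
    store.data.pop(key, None)


def release_if_owner(store, key, token):
    """Safer: deletes the lock only if it still carries our token."""
    if store.data.get(key) == token:
        del store.data[key]
        return True
    return False


store = FakeStore()
token_a, token_b = str(uuid.uuid4()), str(uuid.uuid4())

a_acquired = store.set_if_absent("lock:example", token_a)
store.expire_now("lock:example")                     # A took too long
b_acquired = store.set_if_absent("lock:example", token_b)

# A finishes late. A token-checked release leaves B's lock alone...
a_released = release_if_owner(store, "lock:example", token_a)
b_still_holds = store.data.get("lock:example") == token_b

# ...whereas a blind release silently removes B's lock.
release_blindly(store, "lock:example")
b_lost_lock = "lock:example" not in store.data
```

In this single-threaded sketch the compare and the delete cannot be interleaved with another client. Against a real server, the two steps would have to be made atomic, for example with a transaction or a server-side script, or the same race reappears.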
Even if each of these problems had a one-in-a-million chance of occurring, because
Redis can perform 100,000 operations per second on recent hardware (and up
to 225,000 operations per second on high-end hardware), those problems can come
up when under heavy load,1 so it’s important to get locking right.
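To make the arithmetic concrete, the short calculation below estimates how often a one-in-a-million problem would show up at the throughput figures quoted above. It assumes, purely for illustration, that every operation is independently exposed to the problem; the function name is hypothetical.

```python
ODDS_AGAINST = 1_000_000  # a one-in-a-million chance per operation


def expected_seconds_between_failures(ops_per_second):
    """Average time until one failure, assuming independent operations."""
    # On average one failure occurs per ODDS_AGAINST operations.
    return ODDS_AGAINST / ops_per_second


recent = expected_seconds_between_failures(100_000)    # 10.0 seconds
high_end = expected_seconds_between_failures(225_000)  # roughly 4.4 seconds
```

Under that assumption, a fully loaded server would hit such a problem about every ten seconds on recent hardware, and more than twice as often on high-end hardware.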