6.2.1 Why locks are important
In the first version of our autocomplete, we added and removed items from a LIST. We
did so by wrapping our multiple calls with a MULTI/EXEC pair. Looking all the way back
to section 4.6, we first introduced WATCH/MULTI/EXEC transactions in the context of an
in-game item marketplace. If you remember, the market is structured as a single ZSET,
with members being an object and owner ID concatenated, along with the item price
as the score. Each user has their own HASH, with columns for user name, currently
available funds, and other associated information. Figure 6.2 shows an example of the
marketplace, user inventories, and user information.
You remember that to add an item to the marketplace, we WATCH the seller’s inventory
to make sure the item is still available, add the item to the market ZSET, and remove it from the user’s inventory. The core of our earlier list_item() function
from section 4.4.2 is shown next.
The short comments in this code just hide a lot of the setup and WATCH/MULTI/EXEC
handling that hide the core of what we’re doing, which is why I omitted it here. If you
feel like you need to see that code again, feel free to jump back to section 4.4.2 to
refresh your memory.
Now, to review our purchasing of an item, we WATCH the market and the buyer’s
HASH. After fetching the buyer’s total funds and the price of the item, we verify that the
buyer has enough money. If the buyer has enough money, we transfer money between
the accounts, add the item to the buyer’s inventory, and remove the item from the
market. If the buyer doesn’t have enough money, we cancel the transaction. If a WATCH
error is caused by someone else writing to the market ZSET or the buyer HASH changing,
we retry. The following listing shows the core of our earlier purchase_item()
function from section 4.4.3.
As before, we omit the setup and WATCH/MULTI/EXEC handling to focus on the core of what we’re doing.
To see the necessity of locks at scale, let’s take a moment to simulate the marketplace
in a few different loaded scenarios. We’ll have three different runs: one listing and one
buying process, then five listing processes and one buying process, and finally five listing
and five buying processes. Table 6.1 shows the result of running this simulation.
|Listed items||Bought items||Purchase retries||Average wait per purchase|
|1 lister, 1 buyer||145,000||27,000||80,000||14ms|
|5 listers, 1 buyer||331,000||<200||50,000||150ms|
|5 listers, 5 buyers||206,000||<600||161,000||498ms|
As our overloaded system pushes its limits, we go from roughly a 3-to-1 ratio of retries
per completed sale with one listing and buying process, all the way up to 250 retries
for every completed sale. As a result, the latency to complete a sale increases from
under 10 milliseconds in the moderately loaded system, all the way up to nearly 500
milliseconds in the overloaded system. This is a perfect example of why WATCH/MULTI/EXEC transactions sometimes don’t scale at load, and it’s caused by the fact that while
trying to complete a transaction, we fail and have to retry over and over. Keeping our
data correct is important, but so is actually getting work done. To get past this limitation
and actually start performing sales at scale, we must make sure that we only list or
sell one item in the marketplace at any one time. We do this by using a lock.