This book covers the use of Redis, an in-memory database/data structure server.

10.3.1 Scaling search query volume

As we expand our search engines from chapter 7 (the SORT-based search, the ZSET-based
scored search, our ad-targeting search engine, or even the job-search system), we may
eventually reach a point where a single server can’t handle the number of queries per
second required. In this section, we’ll talk about how to add query slaves to increase
our capacity to serve search requests.

In section 10.1, you saw how to scale read queries against Redis by adding read
slaves. If you haven’t already read section 10.1, you should do so before continuing.
After you have a collection of read slaves to perform queries against, if you’re running
Redis 2.6 or later, you’ll immediately notice that performing search queries will fail.
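This is because performing a search as discussed in chapter 7 requires performing
set and sorted-set store operations (such as SINTERSTORE, SUNIONSTORE, SDIFFSTORE,
ZINTERSTORE, and ZUNIONSTORE), all of which write to Redis.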

In order to perform writes against Redis 2.6 and later, we’ll need to update our Redis
slave configuration. In the Redis configuration file, there’s an option to disable/enable
writing to slaves. This option is called slave-read-only, and it defaults to yes. By changing
slave-read-only to no and restarting our slaves, we should now be able to perform
standard search queries against slave Redis servers. Remember that we cache the results
of our queries, and these cached results are only available on the slave that the queries
were run on. So if we intend to reuse cached results, we’ll probably want to perform
some level of session persistence (where repeated requests from a client go to the same
web server, and that web server always makes requests against the same Redis server).
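
One way to get this kind of session persistence is to map each session to a slave deterministically. The following is only a minimal sketch: the slave addresses, the SLAVES list, and the slave_for_session() helper are hypothetical names introduced for illustration, and it assumes every slave has already been configured with slave-read-only set to no.

```python
import hashlib

# Hypothetical slave addresses; replace with your own hosts and ports.
# Assumes each slave runs with "slave-read-only no" so searches can write.
SLAVES = [
    ('10.0.0.11', 6379),
    ('10.0.0.12', 6379),
    ('10.0.0.13', 6379),
]

def slave_for_session(session_id, slaves=SLAVES):
    """Map a session ID to one slave so repeated requests reuse its cache."""
    # Use a stable hash rather than the built-in hash(), which can differ
    # between Python processes, so every web server picks the same slave.
    digest = hashlib.md5(session_id.encode('utf-8')).hexdigest()
    return slaves[int(digest, 16) % len(slaves)]
```

The returned (host, port) pair can then be used to open a connection with whatever Redis client you use. Note that with this simple modulo scheme, adding or removing a slave remaps most sessions to a different server, so previously cached search results will be missed until they are recomputed.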

In the past, I’ve used this method to scale an ad-targeting engine quickly and easily.
If you decide to go this route to scale search queries, remember to pay attention to
the resync issues discussed in section 10.1.

When we have enough memory in one machine and our operations are read-only
(or at least don’t really change the underlying data to be used by other queries), adding
slaves can help us to scale out. But sometimes data volumes can exceed memory
capacity, and we still need to perform complex queries. How can we scale search when
we have more data than available memory?