We can now search for arbitrary words in our indexed documents. But searching is
only the first step in retrieving the information we’re looking for. Once we have a
list of matching documents, we need to decide how each document should be ordered
relative to the others. This question is generally known as relevance in the search
world, and one way of deciding whether one article is more relevant than another is
to check which article has been updated more recently. Let’s see how we could include
this as part of our search results.
If you remember from chapter 3, the Redis SORT call allows us to sort the contents
of a LIST or SET, possibly referencing external data. For each article in Fake Garage
Startup’s knowledge base, we’ll also include a HASH that stores information about the
article. The information we’ll store about the article includes the title, the creation
timestamp, the timestamp for when the article was last updated, and the document’s ID.
An example document appears in figure 7.4.
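As a rough illustration of that layout, the sketch below writes one article's metadata into a HASH. The key prefix `kb:doc:` and the field names `id`, `title`, `created`, and `updated` are assumptions made for this example, and `conn` is assumed to be a redis-py style client whose `hset` accepts a `mapping` argument.

```python
import time

def store_article_metadata(conn, doc_id, title, created=None, updated=None):
    """Write one article's metadata to a HASH and return the key used.

    The key layout (kb:doc:<id>) and the field names are illustrative
    assumptions, not a fixed schema.
    """
    now = time.time()
    created = now if created is None else created
    # A brand-new article has been "updated" at the moment it was created.
    updated = created if updated is None else updated

    key = 'kb:doc:%s' % doc_id
    fields = {
        'id': doc_id,
        'title': title,
        'created': created,
        'updated': updated,
    }
    # redis-py style call: HSET key field value [field value ...]
    conn.hset(key, mapping=fields)
    return key
```

Keeping the timestamps as plain numbers lets SORT compare them numerically, while a field such as the title would need an alphabetical sort.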
With documents stored in this format, we can then use the SORT command to sort by one
of a few different attributes. We’ve been giving our result SETs expiration times as a
way of cleaning them out shortly after we’ve finished using them. But we could keep our
final search result around longer, which would let us re-sort and even paginate over
the results without having to perform the search again. Our function for integrating
this kind of caching and re-sorting can be seen in the following listing.
When searching and sorting, we can paginate over results by updating the start and
num arguments; alter the sorting attribute (and order) with the sort argument; cache
the results for longer or shorter with the ttl argument; and reference previous search
results (to save time) with the id argument.
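A function with these arguments might look like the following minimal sketch. It assumes a redis-py style client (`expire`, `scard`, and `sort` with `by`, `alpha`, `desc`, `start`, and `num`), cached result SETs stored under an `idx:` prefix, and article HASHes stored under `kb:doc:<id>`. The `search_fn` parameter is a hypothetical stand-in for whatever search function produces the cached result SET and returns its ID. All of these names are assumptions for illustration.

```python
def search_and_sort(conn, query, search_fn, id=None, ttl=300,
                    sort='-updated', start=0, num=20):
    """Return (total matches, one page of sorted doc IDs, result-set ID).

    search_fn(conn, query, ttl) is a hypothetical callable that runs the
    search, caches the matching IDs in the SET 'idx:<id>', and returns <id>.
    """
    # A leading '-' on the sort argument requests descending order.
    desc = sort.startswith('-')
    field = sort.lstrip('-')
    # Sort by a field of the external HASH belonging to each document.
    by = 'kb:doc:*->' + field
    # Timestamps and IDs sort numerically; anything else alphabetically.
    alpha = field not in ('updated', 'id', 'created')

    # Reuse a previous result if it still exists, refreshing its expiration.
    if id and not conn.expire('idx:' + id, ttl):
        id = None
    # Otherwise (or if the cached result expired), run the search again.
    if not id:
        id = search_fn(conn, query, ttl)

    total = conn.scard('idx:' + id)
    page = conn.sort('idx:' + id, by=by, alpha=alpha, desc=desc,
                     start=start, num=num)
    return total, page, id
```

Calling the function again with the returned ID and a different `start` fetches the next page without repeating the search. In a real deployment, the `scard` and `sort` calls could be sent together in a pipeline to save a round trip.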
Though these functions won’t let us create a search engine to compete with
Google, this problem and solution are what brought me to use Redis in the first place.
Limitations on SORT lead to using ZSETs to support more intricate forms of document
sorting, including combining scores for a composite sort order.