7.2.1 Sorting search results with ZSETs
As we saw in chapter 1 and talked about in chapter 3, SETs can actually be provided as
arguments to the ZSET commands ZINTERSTORE and ZUNIONSTORE. When we pass SETs
to these commands, Redis will consider the SET members to have scores of 1. For now,
we aren’t going to worry about the scores of SETs in our operations, but we will later.
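To make the score-of-1 rule concrete, here is a small pure-Python model of what an intersection with the default SUM aggregate computes. It is only a sketch of the arithmetic, not Redis itself: the names as_scored() and zinterstore_model(), and the sample document IDs, are hypothetical.

```python
def as_scored(collection):
    """Return a member -> score mapping; plain sets score every member as 1."""
    if isinstance(collection, dict):
        return dict(collection)
    return {member: 1.0 for member in collection}


def zinterstore_model(inputs):
    """Model an intersection with SUM aggregation.

    inputs is a list of (collection, weight) pairs, where a collection is
    either a set (modeling a SET) or a dict of member -> score (a ZSET).
    """
    scored = [(as_scored(collection), weight) for collection, weight in inputs]
    # Only members present in every input survive the intersection.
    common = set(scored[0][0])
    for scores, _ in scored[1:]:
        common &= set(scores)
    # Each surviving member's score is the weighted sum across all inputs.
    return {
        member: sum(scores[member] * weight for scores, weight in scored)
        for member in common
    }


search_results = {"doc:1", "doc:3"}                      # a SET
votes = {"doc:1": 12.0, "doc:2": 40.0, "doc:3": 5.0}     # a ZSET

combined = zinterstore_model([(search_results, 1), (votes, 1)])
votes_only = zinterstore_model([(search_results, 0), (votes, 1)])
```

With a weight of 1, the SET adds 1 to every surviving member’s score; with a weight of 0, it only filters membership and contributes nothing to the score.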
In this section, we’ll talk about using SETs and ZSETs together for a two-part search-and-sort
operation. When you’ve finished reading this section, you’ll understand why
and how we’d want to combine scores together as part of a document search.
Let’s consider a situation in which we’ve already performed a search and have our
result SET. We can sort our results with the SORT command, but that means we can
only sort based on a single value at a time. Being able to easily sort by a single value is
one of the reasons why we started out sorting with our indexes in the first place.
But say that we want to add the ability to vote on our knowledge base articles to
indicate if they were useful. We could put the vote count in the article HASH and use
SORT as we did before. That’s reasonable. But what if we also wanted to sort based on a
combination of recency and votes? We could do as we did in chapter 1 and predefine
the score increase for each vote. But if we don’t yet know how much a score should
increase with each vote, then picking a value early on will force us to recalculate
every score later, once we find the right number.
Instead, we’ll keep a ZSET of the times that articles were last updated, as well as a
ZSET for the number of votes that an article has received. Both will use the article IDs
of the knowledge base articles as members of the ZSETs, with update times or vote
count as scores, respectively. We’ll also pass similar arguments to an updated
search_and_zsort() function defined in the next listing, in order to calculate the
resulting sort order for only update times, only vote counts, or almost any relative balance
between the two.
Our search_and_zsort() works much like the earlier search_and_sort(), differing
primarily in how we sort/order our results. Rather than calling SORT, we perform a
ZINTERSTORE operation, balancing the search result SET, the updated time ZSET, and
the vote ZSET.
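The balancing step can be sketched without a Redis server. The function below is a pure-Python stand-in, not the book’s listing: the name search_and_zsort_model(), the update and vote weight parameters, and the choice to give the result SET a weight of 0 are all assumptions made for illustration.

```python
def search_and_zsort_model(results, updated, votes,
                           update=1, vote=0, start=0, num=20, desc=True):
    """Model a weighted intersection followed by a ranged read.

    results is a set of article IDs; updated and votes map article IDs to
    scores. Returns (total_count, page_of_ids).
    """
    # Only members present in all three inputs survive the intersection.
    common = set(results) & set(updated) & set(votes)
    # The result SET is assumed to carry weight 0, so its implicit score
    # of 1 drops out and only the two weighted ZSET scores remain.
    scores = {
        doc: update * updated[doc] + vote * votes[doc]
        for doc in common
    }
    # Order by score, breaking ties by member name.
    ordered = sorted(scores, key=lambda doc: (scores[doc], doc), reverse=desc)
    return len(ordered), ordered[start:start + num]


results = {"a", "b", "c"}
updated = {"a": 100.0, "b": 300.0, "c": 200.0}
votes = {"a": 50.0, "b": 1.0, "c": 10.0, "d": 99.0}

by_recency = search_and_zsort_model(results, updated, votes, update=1, vote=0)
by_votes = search_and_zsort_model(results, updated, votes, update=0, vote=1)
blended = search_and_zsort_model(results, updated, votes, update=1, vote=10)
```

Changing only the two weights moves the ordering from pure recency to pure vote count, or to a blend of both, without touching the stored scores.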
As part of search_and_zsort(), we used a helper function for handling the creation
of a temporary ID, the ZINTERSTORE call, and setting the expiration time of the
result ZSET. The zintersect() and zunion() helper functions are shown next.
These helper functions are similar to our SET-based helpers, the primary difference
being that we’re passing a dictionary through to specify scores, so we need to do more
work to properly prefix all of our input keys.
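The key-prefixing work can be illustrated on its own. This is a minimal sketch under stated assumptions: the "idx:" prefix, the 30-second default expiration, and the names prefix_keys() and plan_zset_operation() are hypothetical, and the function only plans the operation rather than talking to Redis.

```python
import uuid


def prefix_keys(scores, prefix="idx:"):
    """Return a new dict with every input key prefixed.

    Building a new dict avoids mutating the caller's dict while iterating
    over it.
    """
    return {prefix + key: weight for key, weight in scores.items()}


def plan_zset_operation(scores, ttl=30, prefix="idx:"):
    """Describe a temporary-result ZSET operation without executing it."""
    temp_id = str(uuid.uuid4())          # temporary ID for the result
    return {
        "id": temp_id,
        "dest": prefix + temp_id,        # destination key for the result ZSET
        "sources": prefix_keys(scores, prefix),
        "ttl": ttl,                      # expiration to set on the result
    }


plan = plan_zset_operation({"abc123": 0, "sort:update": 1, "sort:votes": 0})
```

With a redis-py connection, a plan like this would roughly correspond to calling zinterstore() with plan["dest"] and plan["sources"] (a dictionary of key-to-weight pairs), followed by expire() on plan["dest"] with plan["ttl"].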
Exercise: Article voting
In this section, we used ZSETs to handle combining a time and a vote count for an
article. You remember that we did much the same thing back in chapter 1 without
search, though we did handle groups of articles. Can you update article_vote(),
post_articles(), get_articles(), and get_group_articles() to use this new
method so that we can update our score per vote whenever we want?
In this section, we talked about how to combine SETs and ZSETs to calculate a simple
composite score based on vote count and updated time. Though we used 2 ZSETs as
sources for scores, there’s no reason why we couldn’t have used 1 or 100. It’s all a question
of what we want to calculate.
If we try to fully replace SORT and HASHes with the more flexible ZSET, we run into one
problem almost immediately: scores in ZSETs must be floating-point numbers. But we
can handle this issue in many cases by converting our non-numeric data to numbers.