EBOOK – REDIS IN ACTION

This book covers the use of Redis, an in-memory database/data structure server.

  • Foreword
  • Preface
  • Acknowledgments
  • About this Book
  • About the Cover Illustration
  • Part 1: Getting Started
  • Part 2: Core concepts
  • Part 3: Next steps
  • Appendix A
  • Appendix B
  • Buy the paperback

    7.2.1 Sorting search results with ZSETs

    As we saw in chapter 1 and talked about in chapter 3, SETs can actually be provided as
    arguments to the ZSET commands ZINTERSTORE and ZUNIONSTORE. When we pass SETs
    to these commands, Redis will consider the SET members to have scores of 1. For now,
    we aren’t going to worry about the scores of SETs in our operations, but we will later.

    In this section, we’ll talk about using SETs and ZSETs together for a two-part searchand-
    sort operation. When you’ve finished reading this section, you’ll understand why
    and how we’d want to combine scores together as part of a document search.

    Let’s consider a situation in which we’ve already performed a search and have our
    result SET. We can sort our results with the SORT command, but that means we can
    only sort based on a single value at a time. Being able to easily sort by a single value is
    one of the reasons why we started out sorting with our indexes in the first place.

    But say that we want to add the ability to vote on our knowledge base articles to
    indicate if they were useful. We could put the vote count in the article HASH and use
    SORT as we did before. That’s reasonable. But what if we also wanted to sort based on a
    combination of recency and votes? We could do as we did in chapter 1 and predefine
    the score increase for each vote. But if we don’t have enough information about how
    much scores should increase with each vote, then picking a score early on will force us
    to have to recalculate later when we find the right number.

    Instead, we’ll keep a ZSET of the times that articles were last updated, as well as a
    ZSET for the number of votes that an article has received. Both will use the article IDs
    of the knowledge base articles as members of the ZSETs, with update times or vote
    count as scores, respectively. We’ll also pass similar arguments to an updated search_and_zsort() function defined in the next listing, in order to calculate the
    resulting sort order for only update times, only vote counts, or almost any relative balance
    between the two.

    Listing 7.6An updated function to search and sort based on votes and updated times
    def search_and_zsort(conn, query, id=None, ttl=300, update=1, vote=0,
                        start=0, num=20, desc=True):
    
    

    Like before, we’ll optionally take a previous result ID for pagination if the result is still available.

        if id and not conn.expire(id, ttl):
            id = None
    
    

    We’ll refresh the search result’s TTL if possible.

        if not id:
            id = parse_and_search(conn, query, ttl=ttl)
    
    

    If our search result expired, or if this is the first time we’ve searched, perform the standard SET search.

            scored_search = {
    
                id: 0,
    

    We use the “id” key for the intersection, but we don’t want it to count toward weights.

                'sort:update': update,
                'sort:votes': vote
    

    Set up the scoring adjustments for balancing update time and votes. Remember: votes can be adjusted to 1, 10, 100, or higher, depending on the sorting result desired.

            }
    
            id = zintersect(conn, scored_search, ttl)
    
    

    Intersect using our helper function that we define in listing 7.7.

        pipeline = conn.pipeline(True)
    
        pipeline.zcard('idx:' + id)
    
    

    Fetch the size of the result ZSET.

        if desc:
            pipeline.zrevrange('idx:' + id, start, start + num - 1)
        else:
            pipeline.zrange('idx:' + id, start, start + num - 1)
    

    Handle fetching a “page” of results.

        results = pipeline.execute()
    
    
        return results[0], results[1], id
    

    Return the results and the ID for pagination.

    Our search_and_zsort() works much like the earlier search_and_sort(), differing
    primarily in how we sort/order our results. Rather than calling SORT, we perform a
    ZINTERSTORE operation, balancing the search result SET, the updated time ZSET, and
    the vote ZSET.

    As part of search_and_zsort(), we used a helper function for handling the creation
    of a temporary ID, the ZINTERSTORE call, and setting the expiration time of the
    result ZSET. The zintersect() and zunion() helper functions are shown next.

    Listing 7.7Some helper functions for performing ZSET intersections and unions
    def _zset_common(conn, method, scores, ttl=30, **kw):
    
        id = str(uuid.uuid4())
    

    Create a new temporary identifier.

        execute = kw.pop('_execute', True)
    

    Allow the passing of an argument to determine whether we should defer pipeline execution.

        pipeline = conn.pipeline(True) if execute else conn
    

    Set up a transactional pipeline so that we have consistent results for each individual call.

        for key in scores.keys():
            scores['idx:' + key] = scores.pop(key)
    

    Add the ‘idx:’ prefix to our inputs.

            getattr(pipeline, method)('idx:' + id, scores, **kw)
    

    Set up the call for one of the operations.

            pipeline.expire('idx:' + id, ttl)
    

    Instruct Redis to expire the ZSET in the future.

            if execute:
                pipeline.execute()
    

    Actually execute the operation, unless explicitly instructed not to by the caller.

            return id
    
    

    Return the ID for the caller to process the results.

    def zintersect(conn, items, ttl=30, **kw):
        return _zset_common(conn, 'zinterstore', dict(items), ttl, **kw)
    
    

    Helper function to perform ZSET intersections.

    def zunion(conn, items, ttl=30, **kw):
        return _zset_common(conn, 'zunionstore', dict(items), ttl, **kw)
    

    Helper function to perform ZSET unions.

    These helper functions are similar to our SET-based helpers, the primary difference
    being that we’re passing a dictionary through to specify scores, so we need to do more
    work to properly prefix all of our input keys.

    Exercise: Article voting

    In this section, we used ZSETs to handle combining a time and a vote count for an
    article. You remember that we did much the same thing back in chapter 1 without
    search, though we did handle groups of articles. Can you update article_vote(),
    post_articles(), get_articles(), and get_group_articles() to use this new
    method so that we can update our score per vote whenever we want?

    In this section, we talked about how to combine SETs and ZSETs to calculate a simple
    composite score based on vote count and updated time. Though we used 2 ZSETs as
    sources for scores, there’s no reason why we couldn’t have used 1 or 100. It’s all a question
    of what we want to calculate.

    If we try to fully replace SORT and HASHes with the more flexible ZSET, we run into one
    problem almost immediately: scores in ZSETs must be floating-point numbers. But we
    can handle this issue in many cases by converting our non-numeric data to numbers.