EBOOK – REDIS IN ACTION

This book covers the use of Redis, an in-memory database/data structure server.


    5.2.3 Simplifying our statistics recording and discovery

    Now we have our statistics stored in Redis—what next? More specifically, now that we
    have information about (for example) access time on every page, how do we discover
    which pages take a long time on average to generate? Or how do we know when it
    takes significantly longer to generate a page than it did on previous occasions? The
    simple answer is that we need to store more information in a way that lets us discover
    when both situations happen, which we’ll explore in this section.

    If we want to record access times, then we need to calculate access times. We can
    spend our time adding access time calculations in various places and then adding
    code to record the access times, or we can implement something to help us to calculate
    and record the access times. That same helper could then also make that information
    available in (for example) a ZSET of the slowest pages to access on average, and
    could even report on pages that take a long time to access compared to other times
    that page was accessed.

    To help us calculate and record access times, we’ll write a Python context manager[1]
    that wraps the code whose access time we want to calculate and record. This
    context manager will record the current time, let the wrapped code execute, and
    then calculate the total execution time, record it in Redis, and update a ZSET of
    the contexts with the highest access times. The next listing shows our context
    manager for performing this set of operations.

    Listing 5.8 The access_time() context manager

    @contextlib.contextmanager                        # Make this Python generator into a context manager.
    def access_time(conn, context):
        start = time.time()                           # Record the start time.
        yield                                         # Let the block of code that we're wrapping run.
        delta = time.time() - start                   # Calculate the time that the block took to execute.

        stats = update_stats(conn, context,           # Update the stats for this context.
            'AccessTime', delta)
        average = stats[1] / stats[0]                 # Calculate the average.

        pipe = conn.pipeline(True)
        pipe.zadd('slowest:AccessTime',               # Add the average to a ZSET that holds the
            context, average)                         # slowest access times.
        pipe.zremrangebyrank(                         # Keep only the slowest 100 items in the
            'slowest:AccessTime', 0, -101)            # AccessTime ZSET.
        pipe.execute()

    There’s some magic going on in the access_time() context manager, and it’ll probably
    help to see it in use to understand what’s going on. The following code shows
    access_time() being used to record the access times of web pages that are served
    through the same kind of callback-based middleware layer or plugin that we used in
    our examples from chapter 2:

    def process_view(conn, callback):                 # This web view takes the Redis connection as well
                                                      # as a callback to generate content.
        with access_time(conn, request.path):         # This is how we'd use the access time context
                                                      # manager to wrap a block of code.
            return callback()                         # This is executed while the context manager is
                                                      # paused at its yield statement.

    After seeing the example, even if you don’t yet understand how to create a context
    manager, you should at least know how to use one. In this example, we used the access
    time context manager to calculate the total time to generate a web page. This context
    manager could also be used to record the time it takes to make a database query or
    the amount of time it takes to render a template. As an exercise, can you think of
    other types of context managers that could record statistics that would be useful? Or
    can you add reporting of access times that are more than two standard deviations
    above average to the recent_log()?
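
    As a hint for the standard-deviation exercise: if the per-context statistics also
    included a running sum of squares alongside the count and sum (the names count,
    total, and sumsq below are ours, not necessarily what update_stats() stores), the
    standard deviation can be recovered without keeping any individual samples. A
    minimal sketch:

    import math

    def stddev_from_moments(count, total, sumsq):
        # Sample standard deviation from running aggregates:
        #   count = number of samples
        #   total = sum of the samples
        #   sumsq = sum of the squared samples
        if count < 2:
            return 0.0
        # Var = (sumsq - total**2 / count) / (count - 1)
        return math.sqrt((sumsq - total ** 2 / count) / (count - 1))

    # Samples 1, 2, 3, 4: count=4, total=10, sumsq=30
    print(stddev_from_moments(4, 10.0, 30.0))

    A context whose newest sample exceeded its running average by more than twice this
    value could then be reported via recent_log().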

    GATHERING STATISTICS AND COUNTERS IN THE REAL WORLD

    I know that we just spent several pages talking about how to gather fairly important
    statistics about how our production systems operate, but let me remind you that there
    are preexisting software packages designed for collecting and plotting counters and
    statistics. My personal favorite is Graphite (http://graphite.wikidot.com/), which you
    should probably download and install before spending too much time building your own
    data-plotting library.

    Now that we’ve been recording diverse and important information about the state of
    our application into Redis, knowing more about our visitors can help us to answer
    other questions.

    [1] In Python, a context manager is a specially defined function or class that has parts
    of itself executed before and after a given block of code. This allows, for example, the
    easy opening and automatic closing of files.
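
    To make the footnote concrete, here is a minimal, self-contained sketch (the names
    tracked and events are ours, purely illustrative) showing that the code before a
    context manager's yield runs when the with block is entered, and the code after it
    runs when the block exits:

    import contextlib

    events = []

    @contextlib.contextmanager
    def tracked(name):
        events.append('enter ' + name)   # Runs before the wrapped block.
        yield                            # The with block's body executes here.
        events.append('exit ' + name)    # Runs after the wrapped block.

    with tracked('demo'):
        events.append('body')

    print(events)  # → ['enter demo', 'body', 'exit demo']

    This is exactly the shape of access_time(): timing starts before the yield, and the
    recording happens after it.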