e-Book - Redis in Action

This book covers the use of Redis, an in-memory database/data structure server.
  • Foreword
  • Preface
  • Acknowledgments
  • About this Book
  • About the Cover Illustration
  • Part 1: Getting Started
  • Part 2: Core concepts
  • Part 3: Next steps
  • Appendix A
  • Appendix B
  • Buy the paperback

    6.3.1 Building a basic counting semaphore

    When building a counting semaphore, we run into many of the same concerns we had
    with other types of locking. We must decide who got the lock, how to handle processes
    that crashed with the lock, and how to handle timeouts. If we don’t care about timeouts,

    or handling the case where semaphore holders can crash without releasing semaphores,
    we could build semaphores fairly conveniently in a few different ways. Unfortunately,
    those methods don’t lead us to anything useful in the long term, so I’ll describe one
    method that we’ll incrementally improve to offer a full range of functionality.

    Figure 6.6Basic semaphore ZSET

    In almost every case where we want to deal with timeouts in Redis, we’ll generally
    look to one of two different methods. Either we’ll use EXPIRE like we did with our
    standard locks, or we’ll use ZSETs. In this case, we want to use ZSETs, because that
    allows us to keep information about multiple semaphore holders in a single structure.

    More specifically, for each process that attempts to acquire the semaphore, we’ll
    generate a unique identifier. This identifier will be the member of a ZSET. For the
    score, we’ll use the timestamp for when the process attempted to acquire the semaphore.
    Our semaphore ZSET will look something like figure 6.6.

    When a process wants to attempt to acquire a semaphore, it first generates an identifier,
    and then the process adds the identifier to the ZSET using the current timestamp
    as the score. After adding the identifier, the process then checks for its identifier’s rank.
    If the rank returned is lower than the total allowed count (Redis uses 0-indexing on
    rank), then the caller has acquired the semaphore. Otherwise, the caller doesn’t have
    the semaphore and must delete its identifier from the ZSET. To handle timeouts, before
    adding our identifier to the ZSET, we first clear out any entries that have timestamps
    that are older than our timeout number value. The code to acquire the semaphore can
    be seen next.

    Listing 6.12The acquire_semaphore() function
    def acquire_semaphore(conn, semname, limit, timeout=10):
    
       identifier = str(uuid.uuid4())
    

    A 128-bit random identifier.

       now = time.time()
    
    
       pipeline = conn.pipeline(True)
    
       pipeline.zremrangebyscore(semname, '-inf', now - timeout)
    

    Time out old semaphore holders.

       pipeline.zadd(semname, identifier, now)
    

    Try to acquire the semaphore.

       pipeline.zrank(semname, identifier)
       if pipeline.execute()[-1] < limit:
    

    Check to see if we have it.

          return identifier
    
    
       conn.zrem(semname, identifier)
    

    We failed to get the semaphore; discard our identifier.

       return None
    

    Our code proceeds as I’ve already described: generating the identifier, cleaning out
    any timed-out semaphores, adding its identifier to the ZSET, and checking its rank.
    Not too surprising.

    Releasing the semaphore is easy: we remove the identifier from the ZSET, as can be
    seen in the next listing.

    Listing 6.13The release_semaphore() function
    def release_semaphore(conn, semname, identifier):
    
       return conn.zrem(semname, identifier) 
    

    Returns True if the semaphore was properly released, False if it had timed out

    This basic semaphore works well—it’s simple, and it’s very fast. But relying on every
    process having access to the same system time in order to get the semaphore can
    cause problems if we have multiple hosts. This isn’t a huge problem for our specific
    use case, but if we had two systems A and B, where A ran even 10 milliseconds faster
    than B, then if A got the last semaphore, and B tried to get a semaphore within 10 milliseconds,
    B would actually “steal” A’s semaphore without A knowing it.

    Any time we have a lock or a semaphore where such a slight difference in the system
    clock can drastically affect who can get the lock, the lock or semaphore is considered
    unfair. Unfair locks and semaphores can cause clients that should’ve gotten the lock or
    semaphore to never get it, and this is something that we’ll fix in the next section.