e-Book - Redis in Action

This book covers the use of Redis, an in-memory database/data structure server.

    8.4 Posting or deleting a status update

    One of the most fundamental operations on a service like Twitter is posting status messages.
    People post to share their ideas, and people read because they’re interested in
    what’s going on with others. Section 8.1.2 showed how to create a status message as a prerequisite
    for knowing the types of data that we’ll be storing, but didn’t show how to get
    that status message into a profile timeline or the home timeline of the user’s followers.

    In this section, we’ll discuss what happens to a status message when it’s posted so it
    can find its way into the home timelines of that user’s followers. We’ll also talk about
    how to delete a status message.

    You already know how to create the status message itself, but we now need to get
    the status message ID into the home timeline of all of our followers. How we should
    perform this operation will depend on the number of followers that the posting user
    happens to have. If the user has a relatively small number of followers (say, up to 1,000
    or so), we can update their home timelines immediately. But for users with a larger
    number of followers (like 1 million, or even the 25 million that some users have on
    Twitter), attempting to perform those insertions directly will take longer than is reasonable
    for a user to wait.

    To allow for our call to return quickly, we’ll do two things. First, we’ll add the status
    ID to the home timelines of the first 1,000 followers as part of the call that posts
    the status message. Based on statistics from a site like Twitter, that should handle at
    least 99.9% of all users who post (Twitter-wide analytics suggest that there are
    roughly 100,000–250,000 users with more than 1,000 followers, which amounts to
    roughly 0.1% of the active user base). This means that only the top 0.1% of users will
    need another step.
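
    As a rough illustration of which posters need that extra step, the sketch below splits a follower count into the part handled inside the posting call and the part left for later. The name split_fanout and the INLINE_LIMIT constant are hypothetical helpers for this illustration only; they are not part of the chapter's code.

```python
INLINE_LIMIT = 1000  # assumed cutoff, matching the 1,000 followers named in the text


def split_fanout(follower_count, inline_limit=INLINE_LIMIT):
    """Return (handled_inline, left_for_later) for a given follower count."""
    inline = min(follower_count, inline_limit)
    return inline, follower_count - inline


# A typical user, a user right at the cutoff, and a very popular user.
for count in (150, 1000, 25000000):
    inline, later = split_fanout(count)
    print('%d followers: %d inline, %d deferred' % (count, inline, later))
```

    Under these assumptions, only the last of the three example users leaves any followers for a second step.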

    Second, for those users with more than 1,000 followers, we’ll start a deferred task
    using a system similar to what we built back in section 6.4. The next listing shows the
    code for pushing status updates to followers.

    Listing 8.6 Update a user’s profile timeline
    def post_status(conn, uid, message, **data):
    
        id = create_status(conn, uid, message, **data)
    

    Create a status message using the earlier function.

        if not id:
            return None

    If the creation failed, return.

        posted = conn.hget('status:%s'%id, 'posted')
    

    Get the time that the message was posted.

        if not posted:
            return None

    If the post wasn’t found, return.

        post = {str(id): float(posted)}
    
        conn.zadd('profile:%s'%uid, **post)

    Add the status message to the user’s profile timeline.

        syndicate_status(conn, uid, post)
    

    Actually push the status message out to the followers of the user.

        return id
    

    Notice that we broke our status updating into two parts. The first part calls the
    create_status() function from listing 8.2 to actually create the status message, and
    then adds it to the poster’s profile timeline. The second part actually adds the status
    message to the timelines of the user’s followers, which can be seen next.

    Listing 8.7 Update a user’s followers’ home timelines
    POSTS_PER_PASS = 1000
    

    Only send to 1000 users per pass.

    def syndicate_status(conn, uid, post, start=0):
    
        followers = conn.zrangebyscore('followers:%s'%uid, start, 'inf',
            start=0, num=POSTS_PER_PASS, withscores=True)

    Fetch the next group of 1000 followers, starting from the last follower updated in the previous pass.

        pipeline = conn.pipeline(False)
    
        for follower, start in followers:
    

    Iterating through the followers results will update the “start” variable, which we can later pass on to subsequent syndicate_status() calls.

            pipeline.zadd('home:%s'%follower, **post)
            pipeline.zremrangebyrank(
                'home:%s'%follower, 0, -HOME_TIMELINE_SIZE-1)
    

    Add the status to the home timelines of all of the fetched followers, and trim the home timelines so they don’t get too big.

        pipeline.execute()
        if len(followers) >= POSTS_PER_PASS:
            execute_later(conn, 'default', 'syndicate_status',
                [conn, uid, post, start])
    

    If at least 1000 followers had received an update, execute the remaining updates in a task.

    This second function is what actually handles pushing status messages to the first 1,000
    followers’ home timelines, and starts a delayed task using the API we defined in section
    6.4 for followers past the first 1,000. With those new functions, we’ve now completed
    the tools necessary to actually post a status update and send it to all of a user’s followers.
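
    To see how the paging and trimming behave without a Redis server, here is a minimal in-memory sketch. It is an approximation under stated assumptions: plain lists and dicts stand in for the followers and home ZSETs, a deque stands in for the task queue behind execute_later(), and the small POSTS_PER_PASS and HOME_TIMELINE_SIZE values are chosen only so that several passes are visible. The helper names fetch_followers, trim, syndicate, and fan_out are hypothetical.

```python
from collections import deque

POSTS_PER_PASS = 3       # the listing uses 1000; kept small here for illustration
HOME_TIMELINE_SIZE = 2   # hypothetical small cap for illustration


def fetch_followers(followers, start, num):
    # Emulates ZRANGEBYSCORE <key> start inf LIMIT 0 num WITHSCORES on a list
    # of (follower, followed_at) pairs already sorted by followed_at.
    return [(f, s) for f, s in followers if s >= start][:num]


def trim(timeline, size):
    # Emulates ZREMRANGEBYRANK <key> 0 -size-1: drop everything except the
    # `size` highest-scored (newest) entries.
    for status_id in sorted(timeline, key=timeline.get)[:-size]:
        del timeline[status_id]


def syndicate(followers, homes, post, tasks, start=0):
    batch = fetch_followers(followers, start, POSTS_PER_PASS)
    for follower, start in batch:      # `start` ends as the last score seen
        timeline = homes.setdefault(follower, {})
        timeline.update(post)
        trim(timeline, HOME_TIMELINE_SIZE)
    if len(batch) >= POSTS_PER_PASS:
        tasks.append(start)            # stand-in for execute_later()


def fan_out(followers, homes, post):
    tasks = deque()
    syndicate(followers, homes, post, tasks)
    passes = 1
    while tasks:                       # stand-in for the task-queue worker
        syndicate(followers, homes, post, tasks, tasks.popleft())
        passes += 1
    return passes


followers = [('user%d' % i, float(i)) for i in range(1, 8)]   # 7 followers
homes = {'user1': {'10': 100.0, '11': 101.0}}                 # already at the cap
passes = fan_out(followers, homes, {'12': 102.0})
print(passes, sorted(homes['user1']))
```

    Because the score range is inclusive, each later pass starts by revisiting the last follower of the previous pass. That is harmless here, since adding the same status ID to a timeline twice leaves it unchanged.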

    Exercise: Updating lists

    In the last section, I suggested an exercise to build named lists of users. Can you
    extend the syndicate_status() function to also support updating the list timelines
    from before?

    Let’s imagine that we posted a status message that we weren’t proud of; what would we
    need to do to delete it?

    It turns out that deleting a status message is pretty easy. Before returning the
    fetched status messages from a user’s home or profile timeline in get_messages(),
    we’re already filtering “empty” status messages with the Python filter() function. So
    to delete a status message, we only need to delete the status message HASH and update
    the number of status messages posted for the user. The function that deletes a status
    message is shown in the following listing.

    Listing 8.8 A function to delete a previously posted status message
    def delete_status(conn, uid, status_id):
        key = 'status:%s'%status_id
    
        lock = acquire_lock_with_timeout(conn, key, 1)
    

    Acquire a lock around the status object to ensure that no one else is trying to delete it when we are.

        if not lock:
            return None

    If we didn’t get the lock, return.

        if conn.hget(key, 'uid') != str(uid):
            release_lock(conn, key, lock)
            return None

    If the user doesn’t match the user stored in the status message, release the lock and return.

        pipeline = conn.pipeline(True)
    
        pipeline.delete(key)
    

    Delete the status message.

        pipeline.zrem('profile:%s'%uid, status_id)
    

    Remove the status message ID from the user’s profile timeline.

        pipeline.zrem('home:%s'%uid, status_id)
    

    Remove the status message ID from the user’s home timeline.

        pipeline.hincrby('user:%s'%uid, 'posts', -1)
    

    Reduce the number of posted messages in the user information HASH.

        pipeline.execute()
        release_lock(conn, key, lock)
    
        return True
    

    While deleting the status message and updating the status count, we also went ahead
    and removed the message from the user’s home timeline and profile timeline.
    Though this isn’t technically necessary, it does allow us to keep both of those timelines
    a little cleaner without much effort.
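
    The timeline cleanup is optional because reads already skip IDs whose HASH is gone. A minimal sketch of that filtering follows, assuming each fetched status arrives as a dict and that a deleted status comes back as an empty dict (as an HGETALL on a missing key does). The function name visible_statuses is hypothetical.

```python
def visible_statuses(fetched):
    # An empty dict is falsy, so filter(None, ...) drops deleted statuses.
    return list(filter(None, fetched))


fetched = [
    {'id': '3', 'message': 'third'},
    {},                                # status 2 was deleted
    {'id': '1', 'message': 'first'},
]
print(visible_statuses(fetched))
```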

    Exercise: Cleaning out deleted IDs

    As status messages are deleted, “zombie” status message IDs will still be in the
    home timelines of all followers. Can you clean out these status IDs? Hint: Think about
    how we sent the messages out in the first place. Bonus points: also handle lists.

    Being able to post or delete status messages more or less completes the primary functionality
    of a Twitter-like social network from a typical user’s perspective. But to complete
    the experience, you may want to consider adding a few other features:

    • Private users, along with the ability to request to follow someone
    • Favorites (keeping in mind the privacy of a tweet)
    • Direct messaging between users
    • Replying to messages resulting in conversation flow
    • Reposting/retweeting of messages
    • The ability to @mention users or #tag ideas
    • Keeping a record of who @mentions someone
    • Spam and abuse reporting and controls

    These additional features would help to round out the functionality of a site like Twitter,
    but may not be necessary in every situation. Expanding beyond those features that
    Twitter provides, some social networks have chosen to offer additional functionality
    that you may want to consider:

    • Liking/+1 voting status messages
    • Moving status messages around the timeline depending on “importance”
    • Direct messaging between a prespecified group of people (like in section 6.5.2)
    • Groups where users can post to and/or follow a group timeline (public groups, private groups, or even announcement-style groups)

    Now that we’ve built the last piece of the standard functional API for actually servicing
    a site like Twitter, let’s see what it’d take to build a system for processing streaming API
    requests.