EBOOK – REDIS IN ACTION

This book covers the use of Redis, an in-memory database/data structure server.


8.4 Posting or deleting a status update

One of the most fundamental operations on a service like Twitter is posting status messages.
People post to share their ideas, and people read because they’re interested in
what’s going on with others. Section 8.1.2 showed how to create a status message as a prerequisite
for knowing the types of data that we’ll be storing, but didn’t show how to get
that status message into a profile timeline or the home timeline of the user’s followers.

In this section, we’ll discuss what happens to a status message when it’s posted so it
can find its way into the home timelines of that user’s followers. We’ll also talk about
how to delete a status message.

You already know how to create the status message itself, but we now need to get
the status message ID into the home timeline of all of our followers. How we should
perform this operation will depend on the number of followers that the posting user
happens to have. If the user has a relatively small number of followers (say, up to 1,000
or so), we can update their home timelines immediately. But for users with larger
numbers of followers (like 1 million, or even the 25 million that some users have on
Twitter), attempting to perform those insertions directly will take longer than is reasonable
for a user to wait.

To allow for our call to return quickly, we’ll do two things. First, we’ll add the status
ID to the home timelines of the first 1,000 followers as part of the call that posts
the status message. Based on statistics from a site like Twitter, that should handle at
least 99.9% of all users who post (Twitter-wide analytics suggest that there are
roughly 100,000–250,000 users with more than 1,000 followers, which amounts to
roughly 0.1% of the active user base). This means that only the top 0.1% of users will
need another step.

Second, for those users with more than 1,000 followers, we’ll start a deferred task
using a system similar to what we built back in section 6.4. The next listing shows the
code for pushing status updates to followers.

Listing 8.6 Update a user’s profile timeline
def post_status(conn, uid, message, **data):
    id = create_status(conn, uid, message, **data)

Create a status message using the earlier function.

    if not id:
        return None

If the creation failed, return.

    posted = conn.hget('status:%s'%id, 'posted')

Get the time that the message was posted.

    if not posted:
        return None

If the post wasn’t found, return.

    post = {str(id): float(posted)}
    conn.zadd('profile:%s'%uid, **post)

Add the status message to the user’s profile timeline.

    syndicate_status(conn, uid, post)

Actually push the status message out to the followers of the user.

    return id

Notice that we broke our status updating into two parts. The first part calls the
create_status() function from listing 8.2 to create the status message, and
then adds it to the poster’s profile timeline. The second part adds the status
message to the home timelines of the user’s followers, which can be seen next.
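To make the shape of the `post` mapping concrete, here is a minimal sketch that models a profile timeline as a plain Python dict from status ID to posted timestamp. This is an in-memory stand-in for the Redis ZSET, not the code from the listings, and the helper names `add_to_timeline` and `newest_first` are hypothetical.

```python
# In-memory stand-in for a ZSET: member (status ID) -> score (posted time).
# The helper names are hypothetical and only illustrate the data shape.

def add_to_timeline(timeline, post):
    """Merge a {status_id: posted_time} mapping into the timeline, like ZADD."""
    timeline.update(post)

def newest_first(timeline, count):
    """Return up to `count` status IDs ordered by descending posted time."""
    ordered = sorted(timeline.items(), key=lambda item: item[1], reverse=True)
    return [status_id for status_id, _ in ordered[:count]]

profile = {}
add_to_timeline(profile, {'101': 1700000000.0})
add_to_timeline(profile, {'102': 1700000060.0})
add_to_timeline(profile, {'103': 1700000030.0})

# The two most recent posts, newest first.
latest = newest_first(profile, 2)
```

Because the score is the posted time, the same one-entry mapping can be written unchanged into the poster's profile timeline and into every follower's home timeline.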

Listing 8.7 Update a user’s followers’ home timelines
POSTS_PER_PASS = 1000

Only send to 1000 users per pass.

def syndicate_status(conn, uid, post, start=0):
    followers = conn.zrangebyscore('followers:%s'%uid, start, 'inf',
        start=0, num=POSTS_PER_PASS, withscores=True)

Fetch the next group of 1000 followers, starting at the last person to be updated last time.

    pipeline = conn.pipeline(False)
    for follower, start in followers:

Iterating through the followers results will update the “start” variable, which we can later pass on to subsequent syndicate_status() calls.

        pipeline.zadd('home:%s'%follower, **post)
        pipeline.zremrangebyrank(
            'home:%s'%follower, 0, -HOME_TIMELINE_SIZE-1)

Add the status to the home timelines of all of the fetched followers, and trim the home timelines so they don’t get too big.

    pipeline.execute()
    if len(followers) >= POSTS_PER_PASS:
        execute_later(conn, 'default', 'syndicate_status',
            [conn, uid, post, start])

If at least 1000 followers had received an update, execute the remaining updates in a task.

This second function is what actually handles pushing status messages to the first 1,000
followers’ home timelines, and starts a delayed task using the API we defined in section
6.4 for followers past the first 1,000. With those new functions, we’ve now completed
the tools necessary to actually post a status update and send it to all of a user’s followers.
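The paging and trimming behavior can be simulated without a Redis server. The sketch below uses plain dicts as stand-ins for ZSETs and a Python list as a stand-in for the task queue; the names `followers_page`, `trim`, and `syndicate` are hypothetical, and the small values of `POSTS_PER_PASS` and `HOME_TIMELINE_SIZE` are assumptions chosen only to keep the example short.

```python
POSTS_PER_PASS = 2       # assumed small value for the demo
HOME_TIMELINE_SIZE = 3   # assumed small value for the demo

def by_score(zset):
    """Order (member, score) pairs the way a ZSET does: score, then member."""
    return sorted(zset.items(), key=lambda item: (item[1], item[0]))

def followers_page(followers, start, num):
    """Mimic ZRANGEBYSCORE key start +inf LIMIT 0 num WITHSCORES."""
    return [(f, s) for f, s in by_score(followers) if s >= start][:num]

def trim(timeline, size):
    """Mimic ZREMRANGEBYRANK key 0 -size-1: keep the `size` highest scores."""
    ordered = by_score(timeline)
    for member, _ in ordered[:max(len(ordered) - size, 0)]:
        del timeline[member]

def syndicate(followers, homes, post, start=0):
    """Deliver `post` to one page of followers; return the next cursor or None."""
    page = followers_page(followers, start, POSTS_PER_PASS)
    for follower, start in page:
        home = homes.setdefault(follower, {})
        home.update(post)
        trim(home, HOME_TIMELINE_SIZE)
    # A full page means more followers may remain, so hand back the cursor.
    return start if len(page) >= POSTS_PER_PASS else None

followers = {'f1': 1.0, 'f2': 2.0, 'f3': 3.0, 'f4': 4.0, 'f5': 5.0}
homes = {'f1': {'s1': 10.0, 's2': 20.0, 's3': 30.0}}  # f1's timeline is full

queue = [0]   # stand-in for the deferred task queue
passes = 0
while queue:
    cursor = syndicate(followers, homes, {'s9': 90.0}, queue.pop(0))
    if cursor is not None:
        queue.append(cursor)
    passes += 1
```

Note that the score cursor is inclusive, so the last follower of each full page is fetched again on the next pass. Re-adding the same member with the same score leaves the timeline unchanged, so the overlap costs a little extra work but does not duplicate the post.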

Exercise: Updating lists

In the last section, I suggested an exercise to build named lists of users. Can you
extend the syndicate_status() function to also support updating the list timelines
from before?

Let’s imagine that we posted a status message that we weren’t proud of; what would we
need to do to delete it?

It turns out that deleting a status message is pretty easy. Before returning the
fetched status messages from a user’s home or profile timeline in get_messages(),
we’re already filtering “empty” status messages with the Python filter() function. So
to delete a status message, we only need to delete the status message HASH and update
the number of status messages posted for the user. The function that deletes a status
message is shown in the following listing.

Listing 8.8 A function to delete a previously posted status message
def delete_status(conn, uid, status_id):
    key = 'status:%s'%status_id
    lock = acquire_lock_with_timeout(conn, key, 1)

Acquire a lock around the status object to ensure that no one else is trying to delete it when we are.

    if not lock:
        return None

If we didn’t get the lock, return.

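    if conn.hget(key, 'uid') != str(uid):
        # Release the lock before bailing out so it isn't held until it times out.
        release_lock(conn, key, lock)
        return None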

If the user doesn’t match the user stored in the status message, return.

    pipeline = conn.pipeline(True)
    pipeline.delete(key)

Delete the status message.

    pipeline.zrem('profile:%s'%uid, status_id)

Remove the status message ID from the user’s profile timeline.

    pipeline.zrem('home:%s'%uid, status_id)

Remove the status message ID from the user’s home timeline.

    pipeline.hincrby('user:%s'%uid, 'posts', -1)

Reduce the number of posted messages in the user information HASH.

    pipeline.execute()
    release_lock(conn, key, lock)
    return True

While deleting the status message and updating the status count, we also went ahead
and removed the message from the user’s home timeline and profile timeline.
Though this isn’t technically necessary, it does allow us to keep both of those timelines
a little cleaner without much effort.
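The reason deletion stays cheap is that reads already skip missing statuses. The following sketch illustrates that idea with plain dicts standing in for the status HASHes, the user HASH, and the timeline ZSETs. It is a simplified model under those assumptions (no locking, no pipeline), and the function signatures are hypothetical rather than the ones used in the listings.

```python
def get_messages(statuses, timeline, count=30):
    """Fetch the newest statuses in a timeline, skipping any that were deleted."""
    ordered = sorted(timeline.items(), key=lambda item: item[1], reverse=True)
    # A missing status behaves like HGETALL on a missing key: an empty dict.
    fetched = [statuses.get(status_id, {}) for status_id, _ in ordered[:count]]
    return [status for status in fetched if status]

def delete_status(statuses, users, profile, uid, status_id):
    """Delete a status if it belongs to `uid`; return True on success."""
    status = statuses.get(status_id)
    if not status or status['uid'] != str(uid):
        return None
    del statuses[status_id]
    profile.pop(status_id, None)
    users[str(uid)]['posts'] -= 1
    return True

statuses = {
    '1': {'id': '1', 'uid': '7', 'message': 'hello'},
    '2': {'id': '2', 'uid': '7', 'message': 'oops'},
}
users = {'7': {'posts': 2}}
profile = {'1': 100.0, '2': 200.0}
follower_home = {'1': 100.0, '2': 200.0}

rejected = delete_status(statuses, users, profile, 8, '2')  # wrong user
deleted = delete_status(statuses, users, profile, 7, '2')   # the owner

# The follower's timeline still lists ID '2', but reads no longer return it.
visible = [s['message'] for s in get_messages(statuses, follower_home)]
zombie_still_listed = '2' in follower_home
```

The leftover ID in the follower's home timeline is harmless for reads, although it still occupies one slot in that timeline until it is trimmed or cleaned out.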

Exercise: Cleaning out deleted IDs

As status messages are deleted, “zombie” status message IDs will still be in the
home timelines of all followers. Can you clean out these status IDs? Hint: Think about
how we sent the messages out in the first place. Bonus points: also handle lists.

Being able to post or delete status messages more or less completes the primary functionality
of a Twitter-like social network from a typical user’s perspective. But to complete
the experience, you may want to consider adding a few other features:

  • Private users, along with the ability to request to follow someone
  • Favorites (keeping in mind the privacy of a tweet)
  • Direct messaging between users
  • Replying to messages resulting in conversation flow
  • Reposting/retweeting of messages
  • The ability to @mention users or #tag ideas
  • Keeping a record of who @mentions someone
  • Spam and abuse reporting and controls

These additional features would help to round out the functionality of a site like Twitter,
but may not be necessary in every situation. Expanding beyond those features that
Twitter provides, some social networks have chosen to offer additional functionality
that you may want to consider:

  • Liking/+1 voting status messages
  • Moving status messages around the timeline depending on “importance”
  • Direct messaging between a prespecified group of people (like in section 6.5.2)
  • Groups where users can post to and/or follow a group timeline (public groups, private groups, or even announcement-style groups)

Now that we’ve built the last piece of the standard functional API for actually servicing
a site like Twitter, let’s see what it’d take to build a system for processing streaming API
requests.