EBOOK – REDIS IN ACTION

This book covers the use of Redis, an in-memory database/data structure server.

open all | close all

1.3.1 Voting on articles

First, let’s start with some numbers and limitations on our problem, so we can solve the problem without losing sight of what we’re trying to do. Let’s say that 1,000 articles are submitted each day. Of those 1,000 articles, about 50 of them are interesting enough that we want them to be in the top-100 articles for at least one day. All of those 50 articles will receive at least 200 up votes. We won’t worry about down votes for this version.

Figure 1.6Reddit, a site that offers the ability to vote on articles
Figure 1.7Stack Overflow, a site that offers the ability to vote on questions

When dealing with scores that go down over time, we need to make the posting time, the current time, or both relevant to the overall score. To keep things simple, we’ll say that the score of an item is a function of the time that the article was posted, plus a constant multiplier times the number of votes that the article has received.

Figure 1.8An example article stored as a HASH for our article voting system

The time we’ll use the number of seconds since January 1, 1970, in the UTC time zone, which is commonly referred to as Unix time. We’ll use Unix time because it can be fetched easily in most programming languages and on every platform that we may use Redis on. For our constant, we’ll take the number of seconds in a day (86,400) divided by the number of votes required (200) to last a full day, which gives us 432 “points” added to the score per vote.

To actually build this, we need to start thinking of structures to use in Redis. For starters, we need to store article information like the title, the link to the article, who posted it, the time it was posted, and the number of votes received. We can use a Redis HASH to store this information, and an example article can be seen in figure 1.8.

Using The Colon Character as a SeparatorThroughout this and other chapters, you’ll find that we use the colon character (:) as a separator between parts of names; for example, in figure 1.8, we used it to separate article from the ID of the article, creating a sort of namespace. The choice of : is subjective, but common among Redis users. Other common choices include a period (.), forward slash (/), and even occasionally the pipe character (|). Regardless of what you choose, be consistent, and note how we use colons to define nested namespaces throughout the examples in the book.

To store a sorted set of articles themselves, we’ll use a ZSET, which keeps items ordered by the item scores. We can use our article ID as the member, with the ZSET score being the article score itself. While we’re at it, we’ll create another ZSET with the score being just the times that the articles were posted, which gives us an option of browsing articles based on article score or time. We can see a small example of time- and score- ordered article ZSETs in figure 1.9.

Figure 1.9Two sorted sets representing time-ordered and score-ordered article indexes
Figure 1.10Some users who have voted for article 100408

In order to prevent users from voting for the same article more than once, we need to store a unique listing of users who have voted for each article. For this, we’ll use a SET for each article, and store the member IDs of all users who have voted on the given article. An example SET of users who have voted on an article is shown in figure 1.10.

For the sake of memory use over time, we’ll say that after a week users can no longer vote on an article and its score is fixed. After that week has passed, we’ll delete the SET of users who have voted on the article.

Before we build this, let’s take a look at what would happen if user 115423 were to vote for article 100408 in figure 1.11.

Figure 1.11What happens to our structures when user 115423 votes for article 100408

Now that you know what we’re going to build, let’s build it! First, let’s handle voting. When someone tries to vote on an article, we first verify that the article was posted within the last week by checking the article’s post time with ZSCORE. If we still have time, we then try to add the user to the article’s voted SET with SADD. Finally, if the user didn’t previously vote on that article, we increment the score of the article by 432 (which we calculated earlier) with ZINCRBY (a command that increments the score of a member), and update the vote count in the HASH with HINCRBY (a command that increments a value in a hash). The voting code is shown in listing 1.6.

Listing 1.6
The article_vote() function
ONE_WEEK_IN_SECONDS = 7 * 86400
VOTE_SCORE = 432

Prepare our constants.

def article_vote(conn, user, article):

	cutoff = time.time() - ONE_WEEK_IN_SECONDS

Calculate the cutoff time for voting.

	if conn.zscore('time:', article) < cutoff:

Check to see if the article can still be voted on (we could use the article HASH here, but scores are returned as floats so we don’t have to cast it).

		return

	article_id = article.partition(':')[-1]

Get the id portion from the article:id identifier.

	if conn.sadd('voted:' + article_id, user):
		conn.zincrby('score:', article, VOTE_SCORE)
		conn.hincrby(article, 'votes', 1)

If the user hasn’t voted for this article before, increment the article score and vote count.


Redis TransactionsIn order to be correct, technically our SADD, ZINCRBY, and HINCRBY calls should be in a transaction. But since we don’t cover transactions until chapter 4, we won’t worry about them for now.

Voting isn’t so bad, is it? But what about posting an article?