9.3.3 Calculating aggregates over sharded STRINGs
We have two use cases for calculating aggregates: either we’ll calculate them over
all of the information we know about, or we’ll calculate them over a subset. We’ll start
with the entire population, and then we’ll write code that calculates
aggregates over a smaller group.
To calculate aggregates over everyone we have information for, we’ll recycle some
code that we wrote back in section 6.6.4, specifically the readblocks() function,
which reads blocks of data from a given key. Using this function, we can perform a single
command and round trip with Redis to fetch information about thousands of
users at one time. Our function to calculate aggregates with this block-reading function
is shown next.
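As a rough illustration of the block-reading approach, here is a minimal sketch. It is not the original listing: the key pattern location:<shard>, the USERS_PER_SHARD value, the max_user_id argument, and the update_aggregates callback are all assumptions made so that the example stands alone, and readblocks() here is a simplified stand-in built on GETRANGE. We also assume that conn is a redis-py style client that returns bytes and that every user occupies exactly 2 bytes.

```python
from collections import defaultdict

USERS_PER_SHARD = 2 ** 20  # assumed shard size; must match the value used when writing
BLOCK_SIZE = 2 ** 17       # bytes per GETRANGE call; kept even so 2-byte codes never straddle blocks


def readblocks(conn, key, blocksize=BLOCK_SIZE):
    """Yield successive chunks of the STRING stored at key (simplified stand-in)."""
    pos = 0
    while True:
        # GETRANGE uses inclusive end offsets and returns an empty
        # value once we read past the end of the STRING.
        block = conn.getrange(key, pos, pos + blocksize - 1)
        if not block:
            return
        yield block
        pos += len(block)
        if len(block) < blocksize:
            return  # a short block means we reached the end


def aggregate_location(conn, max_user_id, update_aggregates,
                       users_per_shard=USERS_PER_SHARD, blocksize=BLOCK_SIZE):
    """Scan every shard up to the one holding max_user_id and aggregate all codes."""
    countries = defaultdict(int)
    states = defaultdict(lambda: defaultdict(int))

    max_shard = max_user_id // users_per_shard
    for shard_id in range(max_shard + 1):
        # 'location:<shard>' is an assumed key naming scheme.
        for block in readblocks(conn, 'location:%d' % shard_id, blocksize):
            # Slice the block into 2-byte codes, ignoring a trailing odd byte.
            codes = [block[i:i + 2] for i in range(0, len(block) - 1, 2)]
            update_aggregates(countries, states, codes)

    return countries, states
```

With the default block size, each call to GETRANGE returns the codes for 65,536 users, which is where the savings in round trips come from.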
This function, which calculates country- and state-level aggregates for
everyone, uses a structure called a defaultdict, which we first used in chapter 6
to calculate aggregates about location information before writing back to Redis.
Inside this function, we refer to a helper function that actually updates the aggregates
and decodes location codes back into their original ISO3 country codes and local state
abbreviations, which can be seen in this next listing.
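A helper of this kind might look roughly like the following sketch. The COUNTRIES and STATES tables here are tiny, hypothetical stand-ins for the real lookup tables, and we assume that each byte of a code stores a table index plus 1, so that a byte of 0 means "unknown". We also assume the codes arrive as Python 3 bytes objects.

```python
from collections import defaultdict

# Abbreviated, hypothetical lookup tables. The real tables are much longer,
# and their order must match the tables used when the codes were written.
COUNTRIES = ['CAN', 'USA']
STATES = {
    'CAN': ['AB', 'BC', 'ON'],
    'USA': ['CA', 'NY', 'TX'],
}


def update_aggregates(countries, states, codes):
    """Decode each 2-byte location code and increment the matching counters."""
    for code in codes:
        # Missing users come back as empty or short values; skip them.
        if len(code) != 2:
            continue

        # Indexing bytes yields ints in Python 3. We assume stored values
        # are index + 1, so subtracting 1 turns "unknown" (0) into -1.
        country_index = code[0] - 1
        state_index = code[1] - 1

        if not 0 <= country_index < len(COUNTRIES):
            continue
        country = COUNTRIES[country_index]
        countries[country] += 1

        # Only count a state when the country has a state table and the
        # index falls inside it.
        state_names = STATES.get(country, [])
        if not 0 <= state_index < len(state_names):
            continue
        states[country][state_names[state_index]] += 1
```

Because the helper accepts any iterable of codes, the same function can be fed whole blocks of codes or the results of individual lookups.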
With a function to convert location codes back into useful location information and
update aggregate information, we have the building blocks to perform aggregates
over a subset of users. As an example, say that we have location information for many
Twitter users, along with follower information for each user. To discover
where the followers of a given user are located, we’d only
need to fetch location information for those followers and compute aggregates the same way
we computed our global aggregates. The next listing shows a function that will aggregate location
information over a provided list of user IDs.
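One way to aggregate over a list of user IDs is sketched below. Again, this is an illustration under assumptions rather than the original listing: the key pattern location:<shard>, the shard size, the batch size of 1,000, and the update_aggregates callback are ours, and conn is assumed to be a redis-py style client whose pipeline queues commands until execute() is called.

```python
from collections import defaultdict

USERS_PER_SHARD = 2 ** 20  # assumed shard size; must match the value used when writing
BATCH_SIZE = 1000          # assumed number of lookups to queue per round trip


def aggregate_location_list(conn, user_ids, update_aggregates,
                            users_per_shard=USERS_PER_SHARD,
                            batch_size=BATCH_SIZE):
    """Aggregate location codes for just the given user IDs."""
    countries = defaultdict(int)
    states = defaultdict(lambda: defaultdict(int))

    # A non-transactional pipeline batches commands without MULTI/EXEC.
    pipe = conn.pipeline(transaction=False)
    pending = 0

    for user_id in user_ids:
        # The shard and byte offset follow directly from the user ID,
        # because every user occupies exactly 2 bytes.
        shard_id, position = divmod(user_id, users_per_shard)
        offset = position * 2
        pipe.getrange('location:%d' % shard_id, offset, offset + 1)
        pending += 1

        # Flush periodically so a single request never grows too large.
        if pending == batch_size:
            update_aggregates(countries, states, pipe.execute())
            pending = 0

    # Handle whatever is left over from the last partial batch.
    if pending:
        update_aggregates(countries, states, pipe.execute())

    return countries, states
```

To answer the follower question from the text, we would pass the follower IDs of the user we care about as user_ids. Users with no stored location come back as empty or short values, so the helper that updates the aggregates needs to skip them.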
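This technique of storing fixed-length data in sharded STRINGs is useful whenever every
record has the same size and is looked up by a sequential integer ID, because a record’s
shard and offset can be computed directly from its ID. Though
we stored multiple bytes of data per user, we can use GETBIT and SETBIT in the same way to
store individual bits, or even groups of bits.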
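To make the single-bit case concrete, here is a small sketch that stores one flag per user. The flag: key prefix, the BITS_PER_SHARD value, and the function names are hypothetical, and conn is assumed to be a redis-py style client that exposes setbit() and getbit().

```python
BITS_PER_SHARD = 2 ** 23  # hypothetical: 8,388,608 flags (1 MiB) per STRING


def set_flag(conn, user_id, value, prefix='flag:',
             bits_per_shard=BITS_PER_SHARD):
    """Store one bit for user_id; returns the bit's previous value."""
    # With one bit per user, the position within the shard is the bit offset.
    shard_id, offset = divmod(user_id, bits_per_shard)
    return conn.setbit('%s%d' % (prefix, shard_id), offset, 1 if value else 0)


def get_flag(conn, user_id, prefix='flag:',
             bits_per_shard=BITS_PER_SHARD):
    """Read the bit for user_id; unset bits read as False."""
    shard_id, offset = divmod(user_id, bits_per_shard)
    return bool(conn.getbit('%s%d' % (prefix, shard_id), offset))
```

The only change from the byte-oriented version is that the offset is counted in bits rather than bytes, so there is no need to multiply the position by the record size.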