5.3.1 Loading the location tables

For development data, I’ve downloaded a free IP-to-city database available from http://dev.maxmind.com/geoip/geolite. This database contains two important files:
GeoLiteCity-Blocks.csv, which contains information about ranges of IP addresses and city
IDs for those ranges, and GeoLiteCity-Location.csv, which contains a mapping of city
IDs to the city name, the name of the region/state/province, the name of the country,
and some other information that we won’t use.

We’ll first construct the lookup table that allows us to take an IP address and convert
it to a city ID. We’ll then construct a second lookup table that allows us to take the
city ID and convert it to actual city information (city information will also include
region and country information).

The table that maps an IP address to a city ID will be constructed
from a single ZSET, which has a special city ID as the member and the integer
value of the IP address as the score. To allow us to map from IP address to city ID, we
convert dotted-quad format IP addresses to an integer score by taking each octet as a
byte in an unsigned 32-bit integer, with the first octet being the highest bits. Code to
perform this operation is shown in the following listing.

Listing 5.9 The ip_to_score() function
def ip_to_score(ip_address):
   score = 0
   for v in ip_address.split('.'):
      score = score * 256 + int(v, 10)
   return score
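
As a quick sanity check of the conversion, the following sketch repeats ip_to_score() and compares its result with Python's standard ipaddress module, which should give the same integer for an IPv4 address. The sample addresses are arbitrary and chosen only for illustration.

```python
import ipaddress

def ip_to_score(ip_address):
    score = 0
    for v in ip_address.split('.'):
        score = score * 256 + int(v, 10)
    return score

# Arbitrary sample addresses, used only for illustration.
samples = ['0.0.0.0', '1.2.3.4', '255.255.255.255']
scores = [ip_to_score(ip) for ip in samples]

# The first octet lands in the highest byte: 1.2.3.4 becomes 0x01020304.
assert scores[1] == 0x01020304

# The standard library agrees with the hand-rolled conversion.
assert scores == [int(ipaddress.IPv4Address(ip)) for ip in samples]
```

Because the first octet occupies the highest bits, numeric ordering of the scores matches the ordering of the addresses, which is what makes them usable as ZSET scores.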

After we have the score, we’ll add the IP address mapping to city IDs first. To construct
a unique city ID from each normal city ID (because multiple IP address ranges can
map to the same city ID), we’ll append a _ character followed by the number of entries
we’ve added to the ZSET already, as can be seen in the next listing.

Listing 5.10 The import_ips_to_redis() function
import csv

def import_ips_to_redis(conn, filename):
   # Should be run with the location of the GeoLiteCity-Blocks.csv file.
   # Opened in text mode for Python 3; the csv module expects newline=''.
   with open(filename, newline='', encoding='latin-1') as f:
      for count, row in enumerate(csv.reader(f)):
         start_ip = row[0] if row else ''
         # Skip rows whose first field contains the letter 'i'.
         if 'i' in start_ip.lower():
            continue
         # Convert the IP address to a score as necessary.
         if '.' in start_ip:
            start_ip = ip_to_score(start_ip)
         elif start_ip.isdigit():
            start_ip = int(start_ip, 10)
         else:
            # Header row or malformed entry.
            continue

         # Construct the unique city ID.
         city_id = row[2] + '_' + str(count)
         # Add the IP address score and city ID.
         # redis-py 3.0 and later take a {member: score} mapping.
         conn.zadd('ip2cityid:', {city_id: start_ip})
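
To see why the suffix matters, recall that ZSET members are unique, so adding the same member a second time only updates its score. The sketch below uses a plain dict as a stand-in for the ZSET (an assumption made so the example runs without a Redis server) and a few made-up rows in the column order the listing reads: start address first, city ID third.

```python
# Made-up rows in (start_ip, end_ip, city_id) order; city 17 owns two ranges.
rows = [
    ('16777216', '16777471', '17'),
    ('16777472', '16778239', '49'),
    ('16778240', '16779263', '17'),
]

def load(rows, unique):
    zset = {}  # member -> score, standing in for the Redis ZSET
    for count, row in enumerate(rows):
        member = row[2] + '_' + str(count) if unique else row[2]
        zset[member] = int(row[0], 10)
    return zset

plain = load(rows, unique=False)
suffixed = load(rows, unique=True)

# Without the suffix, the second range for city 17 replaces the first.
assert len(plain) == 2 and plain['17'] == 16778240

# With the suffix, every range keeps its own entry.
assert len(suffixed) == 3

# The original city ID is recovered by dropping everything from the '_' on.
assert [m.partition('_')[0] for m in suffixed] == ['17', '49', '17']
```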

When our IP addresses have all been loaded by calling import_ips_to_redis(), we’ll
create a HASH that maps city IDs to city information, as shown in the next listing. We’ll
store the city information as a list encoded with JSON, because all of our entries are of
a fixed format that won’t be changing over time.

Listing 5.11 The import_cities_to_redis() function
import csv
import json

def import_cities_to_redis(conn, filename):
   # Should be run with the location of the GeoLiteCity-Location.csv file.
   # The file is read as Latin-1 text, so fields need no separate decode step.
   with open(filename, newline='', encoding='latin-1') as f:
      for row in csv.reader(f):
         # Skip header rows and malformed entries.
         if len(row) < 4 or not row[0].isdigit():
            continue

         # Prepare the information for adding to the HASH.
         city_id = row[0]
         country = row[1]
         region = row[2]
         city = row[3]

         # Actually add the city information to Redis.
         conn.hset('cityid2city:', city_id,
            json.dumps([city, region, country]))
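
The JSON encoding used for each HASH value can be checked with a small round trip. This is only a sketch: a plain dict stands in for the cityid2city: HASH so that no Redis server is needed, and the sample row is made up, following the column order the listing reads (city ID, country, region, city).

```python
import json

# Made-up row: city ID, country, region, city.
row = ['1', 'CH', 'ZH', 'Zürich']
city_id, country, region, city = row[:4]

# A dict stands in for the Redis HASH: field -> JSON-encoded list.
cityid2city = {}
cityid2city[city_id] = json.dumps([city, region, country])

encoded = cityid2city['1']

# json.dumps() escapes non-ASCII characters by default,
# so the stored value is plain ASCII.
assert encoded.isascii()

# Decoding restores the original [city, region, country] list.
assert json.loads(encoded) == ['Zürich', 'ZH', 'CH']
```

Storing a fixed-order list rather than an object keeps each value compact, at the cost of the reader having to know the field order.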

Now that we have all of our information in Redis, we can start looking up IP addresses.