EBOOK – REDIS IN ACTION

This book covers the use of Redis, an in-memory database/data structure server.


    5.3.1 Loading the location tables

    For development data, I’ve downloaded a free IP-to-city database available from http://dev.maxmind.com/geoip/geolite. This database contains two important files:
    GeoLiteCity-Blocks.csv, which contains information about ranges of IP addresses and city
    IDs for those ranges, and GeoLiteCity-Location.csv, which contains a mapping of city
    IDs to the city name, the name of the region/state/province, the name of the country,
    and some other information that we won’t use.

    We’ll first construct the lookup table that allows us to take an IP address and convert
    it to a city ID. We’ll then construct a second lookup table that allows us to take the
    city ID and convert it to actual city information (city information will also include
    region and country information).

    The table that lets us look up an IP address and turn it into a city ID will be built
    from a single ZSET, which has a special city ID as the member and the integer
    value of the IP address as the score. To allow us to map from IP address to city ID, we
    convert dotted-quad format IP addresses to an integer score by taking each octet as a
    byte in an unsigned 32-bit integer, with the first octet being the highest bits. The code
    that performs this conversion is shown in the following listing.

    Listing 5.9 The ip_to_score() function
    def ip_to_score(ip_address):
       score = 0
       for v in ip_address.split('.'):
          score = score * 256 + int(v, 10)
       return score
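
As a quick sanity check of the conversion, the sketch below repeats ip_to_score() and pairs it with an inverse. The score_to_ip() helper is hypothetical and is not part of the original listings; it only illustrates that the first octet ends up in the highest 8 bits of the score.

```python
def ip_to_score(ip_address):
    # Treat each octet as one byte of an unsigned 32-bit integer,
    # with the first octet in the highest bits.
    score = 0
    for v in ip_address.split('.'):
        score = score * 256 + int(v, 10)
    return score

def score_to_ip(score):
    # Hypothetical helper (an assumption, not from the book):
    # undo ip_to_score() by extracting one byte at a time.
    octets = []
    for shift in (24, 16, 8, 0):
        octets.append(str((score >> shift) & 0xFF))
    return '.'.join(octets)

print(ip_to_score('1.2.3.4'))   # 16909060
print(score_to_ip(16909060))    # 1.2.3.4
```

Because the first octet carries the most weight, numeric order of the scores matches the natural order of the addresses, which is what lets a ZSET sorted by score represent IP ranges.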
    

    After we have the score, we’ll add the IP address mapping to city IDs first. To construct
    a unique city ID from each normal city ID (because multiple IP address ranges can
    map to the same city ID), we’ll append a _ character followed by the number of entries
    we’ve added to the ZSET already, as can be seen in the next listing.
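
To see why the suffix is needed, recall that each member of a ZSET is unique, so adding the same member again only replaces its score. The sketch below uses a plain dict as a stand-in for the ZSET (member to score); the add_block() and member_to_city_id() helpers and the sample values are assumptions made for illustration, not part of the original code.

```python
def add_block(zset, city_id, start_score, unique):
    # zset is a plain dict standing in for a Redis ZSET: member -> score.
    if unique:
        # Append '_' plus the number of entries added so far.
        member = city_id + '_' + str(len(zset))
    else:
        member = city_id
    zset[member] = start_score
    return member

def member_to_city_id(member):
    # Strip the '_<count>' suffix to recover the original city ID.
    return member.partition('_')[0]

plain, suffixed = {}, {}
# Two made-up IP blocks that both belong to (made-up) city ID '17'.
for start in (16777216, 16909060):
    add_block(plain, '17', start, unique=False)
    add_block(suffixed, '17', start, unique=True)

print(len(plain))     # 1, the second block replaced the first
print(len(suffixed))  # 2, both blocks are kept
print(sorted(member_to_city_id(m) for m in suffixed))  # ['17', '17']
```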

    Listing 5.10 The import_ips_to_redis() function
    import csv

    def import_ips_to_redis(conn, filename):
       # Should be run with the location of the GeoLiteCity-Blocks.csv file.
       csv_file = csv.reader(open(filename, 'rb'))
       for count, row in enumerate(csv_file):
          # Convert the IP address to a score as necessary.
          start_ip = row[0] if row else ''
          if 'i' in start_ip.lower():
             continue
          if '.' in start_ip:
             start_ip = ip_to_score(start_ip)
          elif start_ip.isdigit():
             start_ip = int(start_ip, 10)
          else:
             # Header row or malformed entry.
             continue

          # Construct the unique city ID.
          city_id = row[2] + '_' + str(count)
          # Add the IP address score and city ID.
          conn.zadd('ip2cityid:', city_id, start_ip)

    When our IP addresses have all been loaded by calling import_ips_to_redis(), we’ll
    create a HASH that maps city IDs to city information, as shown in the next listing. We’ll
    store the city information as a list encoded with JSON, because all of our entries are of
    a fixed format that won’t be changing over time.

    Listing 5.11 The import_cities_to_redis() function
    import csv
    import json

    def import_cities_to_redis(conn, filename):
       # Should be run with the location of the GeoLiteCity-Location.csv file.
       for row in csv.reader(open(filename, 'rb')):
          # Skip rows that are too short or don't start with a numeric city ID.
          if len(row) < 4 or not row[0].isdigit():
             continue
          row = [i.decode('latin-1') for i in row]

          # Prepare the information for adding to the HASH.
          city_id = row[0]
          country = row[1]
          region = row[2]
          city = row[3]
          # Actually add the city information to Redis.
          conn.hset('cityid2city:', city_id,
             json.dumps([city, region, country]))
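
The JSON-encoded list stored for each city can be read back by position, since every entry has the same [city, region, country] layout. The following is a minimal sketch of that round trip, using a plain dict in place of the Redis HASH; encode_city(), decode_city(), and the sample row are assumptions for illustration only.

```python
import json

def encode_city(city, region, country):
    # Same layout as the stored value: [city, region, country].
    return json.dumps([city, region, country])

def decode_city(value):
    # Hypothetical helper: unpack the fixed-format list by position.
    city, region, country = json.loads(value)
    return city, region, country

# A plain dict standing in for the 'cityid2city:' HASH (field -> value).
cityid2city = {}
# Made-up sample row, not taken from the real data set.
cityid2city['17'] = encode_city('Toronto', 'ON', 'CA')

print(cityid2city['17'])               # ["Toronto", "ON", "CA"]
print(decode_city(cityid2city['17']))  # ('Toronto', 'ON', 'CA')
```

A list is more compact than a JSON object with named keys, at the cost of relying on the field order never changing, which matches the fixed-format assumption stated above.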

    Now that we have all of our information in Redis, we can start looking up IP addresses.