EBOOK – REDIS IN ACTION

This book covers the use of Redis, an in-memory database/data structure server.

    7.4.2 Approaching the problem like search

    In section 7.3.3, we used SETs and ZSETs as holders for additive bonuses for optional
    targeting parameters. If we’re careful, we can do the same thing for groups of
    required targeting parameters.

    Rather than talk about jobs with skills, we need to flip the problem around like we
    did with the other search problems described in this chapter. We start with one SET
    per skill, which stores all of the jobs that require that skill. In a required skills ZSET, we
    store the total number of skills that a job requires. The code that sets up our index
    looks like the next listing.

    Listing 7.18 A function for indexing jobs based on the required skills
    def index_job(conn, job_id, skills):
        pipeline = conn.pipeline(True)
        for skill in skills:
            # Add the job ID to all appropriate skill SETs.
            pipeline.sadd('idx:skill:' + skill, job_id)
        # Add the total required skill count to the required skills ZSET.
        pipeline.zadd('idx:jobs:req', job_id, len(set(skills)))
        pipeline.execute()

    This indexing function should remind you of the text indexing function we used in
    section 7.1. The only major difference is that we’re providing index_job() with pretokenized
    skills, and we’re adding a member to a ZSET that keeps a record of the number
    of skills that each job requires.
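
    The zadd() call in listing 7.18 passes the member and score as positional
    arguments, which matches older redis-py clients. As a minimal sketch, assuming
    redis-py 3.0 or later (where zadd() takes a mapping of members to scores), the
    same indexing step might look like the following. The name index_job_v3 is
    hypothetical and used only for illustration.

```python
def index_job_v3(conn, job_id, skills):
    """Index a job by its required skills (assumes redis-py >= 3.0)."""
    unique_skills = set(skills)
    pipeline = conn.pipeline(True)
    for skill in unique_skills:
        # One SET per skill, holding every job that requires it.
        pipeline.sadd('idx:skill:' + skill, job_id)
    # redis-py 3.0+ expects a {member: score} mapping for ZADD.
    pipeline.zadd('idx:jobs:req', {job_id: len(unique_skills)})
    pipeline.execute()
```

    Deduplicating the skills once up front keeps the SET writes and the stored
    count consistent with each other.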

    To perform a search for jobs that a candidate has all of the skills for, we need to
    approach the search like we did with the bonuses to ad targeting in section 7.3.3.
    More specifically, we’ll perform a ZUNIONSTORE operation over skill SETs to calculate a
    total score for each job. This score represents how many skills the candidate has for
    each of the jobs.

    Because we have a ZSET with the total number of skills required, we can then perform
    a ZINTERSTORE operation between the candidate’s ZSET and the required skills
    ZSET with weights -1 and 1, respectively. Any job ID with a score equal to 0 in that final
    result ZSET is a job that the candidate has all of the required skills for. The code for
    implementing the search operation is shown in the following listing.

    Listing 7.19 Find all jobs that a candidate is qualified for
    def find_jobs(conn, candidate_skills):
        # Set up the dictionary for scoring the jobs.
        skills = {}
        for skill in set(candidate_skills):
            skills['skill:' + skill] = 1
        # Calculate the scores for each of the jobs.
        job_scores = zunion(conn, skills)
        # Calculate how many more skills the job requires than the candidate has.
        final_result = zintersect(
            conn, {job_scores:-1, 'jobs:req':1})
        # Return the jobs that the candidate has the skills for.
        return conn.zrangebyscore('idx:' + final_result, 0, 0)

    Again, we first find the scores for each job. After we have the scores for each job, we
    subtract each job score from the total score necessary to match. In that final result,
    any job with a ZSET score of 0 is a job that the candidate has all of the skills for.
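
    To make the arithmetic concrete, here is a small pure-Python stand-in for the
    three steps (union, weighted intersection, range query). It doesn't talk to Redis
    at all; the function name find_jobs_in_memory, the dictionaries, and the sample
    job IDs are assumptions made up for this illustration.

```python
def find_jobs_in_memory(skill_sets, required_counts, candidate_skills):
    """Pure-Python stand-in for the search steps; no Redis server involved.

    skill_sets:      {skill: set of job IDs requiring that skill}
    required_counts: {job ID: number of distinct skills the job requires}
    """
    # ZUNIONSTORE step: each SET member counts as 1, so a job's score is
    # the number of its required skills that the candidate has.
    job_scores = {}
    for skill in set(candidate_skills):
        for job_id in skill_sets.get(skill, ()):
            job_scores[job_id] = job_scores.get(job_id, 0) + 1

    # ZINTERSTORE step with weights -1 and 1: only jobs present in both
    # inputs survive, scored as (required skills) - (matched skills).
    missing = {job_id: required_counts[job_id] - score
               for job_id, score in job_scores.items()
               if job_id in required_counts}

    # ZRANGEBYSCORE 0 0 step: keep jobs with no missing skills.
    return sorted(job_id for job_id, gap in missing.items() if gap == 0)


# Hypothetical sample data: three jobs and the skills they require.
skill_sets = {
    'python': {'job:1', 'job:2'},
    'redis': {'job:1', 'job:3'},
    'sql': {'job:2'},
}
required_counts = {'job:1': 2, 'job:2': 2, 'job:3': 1}

qualified = find_jobs_in_memory(skill_sets, required_counts, ['python', 'redis'])
print(qualified)  # ['job:1', 'job:3']
```

    Here job:2 is left out because it also requires sql, so its difference is 1
    rather than 0.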

    Depending on the number of jobs and searches that are being performed, our job-search
    system may or may not perform as fast as we need it to, especially with large
    numbers of jobs or searches. But if we apply sharding techniques that we’ll discuss in
    chapter 9, we can break the large calculations into smaller pieces and calculate partial
    results bit by bit. Alternatively, if we first find the SET of jobs in a given location and
    search only within that SET, we could perform the same kind of optimization that we
    performed with ad targeting in section 7.3.3, which could greatly improve job-search
    performance.
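
    As a rough illustration of the location-first idea, the following pure-Python
    sketch narrows each skill SET to the jobs in one location before any scoring
    happens. It's only a model of the approach, not Redis code: the location_jobs
    set, the function name, and the sample data are all assumptions for this example.

```python
def find_jobs_in_location(skill_sets, required_counts, location_jobs,
                          candidate_skills):
    """Score only the jobs that fall inside one location (in-memory sketch)."""
    job_scores = {}
    for skill in set(candidate_skills):
        # Narrow each skill set to the location first, so the scoring
        # loop touches fewer members.
        for job_id in skill_sets.get(skill, set()) & location_jobs:
            job_scores[job_id] = job_scores.get(job_id, 0) + 1
    # A job matches when the candidate covers every required skill.
    return sorted(job_id for job_id, score in job_scores.items()
                  if required_counts.get(job_id) == score)


# Hypothetical sample data.
skill_sets = {
    'python': {'job:1', 'job:2'},
    'redis': {'job:1', 'job:3'},
    'sql': {'job:2'},
}
required_counts = {'job:1': 2, 'job:2': 2, 'job:3': 1}
location_jobs = {'job:2', 'job:3'}

nearby = find_jobs_in_location(
    skill_sets, required_counts, location_jobs, ['python', 'redis'])
print(nearby)  # ['job:3']
```

    The candidate also qualifies for job:1, but it's outside the location, so it's
    never scored.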

    Exercise: Levels of experience

    A natural extension to the simple required skills listing is an understanding that skill
    levels vary from beginner to intermediate, to expert, and beyond. Can you come up
    with a method using additional SETs to offer the ability, for example, for someone
    who has an intermediate level in a skill to find jobs that require either beginner or
    intermediate-level candidates?

    Exercise: Years of experience

    Levels of expertise can be useful, but another way to look at the amount of experience
    someone has is the number of years they’ve used a skill. Can you build an alternate
    version that supports handling arbitrary numbers of years of experience?