e-Book - Redis in Action

This book covers the use of Redis, an in-memory database/data structure server.

    9.1.2 The intset encoding for SETs

    Like the ziplist for LISTs, HASHes, and ZSETs, there’s also a compact representation
    for short SETs. If our SET members can all be interpreted as base-10 integers within
    the range of our platform’s signed long integer, and our SET is short enough
    (we’ll get to that in a moment), Redis will store our SET as a sorted array of integers,
    or intset.
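The membership condition above can be sketched as a small client-side check. This is only an illustration, not Redis's own parsing code: the function name is hypothetical, the 64-bit range is an assumption about the platform, and the requirement that the text round-trips exactly (so that values such as "007" do not count) is also an assumption.

```python
# Assumed range: a 64-bit signed integer. The text only says "our platform's
# signed long integer", so adjust these bounds if your platform differs.
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def could_be_intset_member(member):
    """Return True if `member` looks like a base-10 integer in the assumed range."""
    try:
        text = member.decode() if isinstance(member, bytes) else str(member)
        value = int(text, 10)
    except ValueError:
        # Not decodable or not an integer at all (for example 'abc' or '1.5').
        return False
    # Assumption: only the canonical spelling counts, so '007', '+7', and ' 7'
    # are treated as plain strings rather than integers.
    if str(value) != text:
        return False
    return INT64_MIN <= value <= INT64_MAX
```

For example, `could_be_intset_member('42')` is true, while `could_be_intset_member('abc')` and `could_be_intset_member(str(2**63))` are both false.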

    By storing a SET as a sorted array, not only do we have low overhead, but all of the
    standard SET operations can be performed quickly. But how big is too big? The next
    listing shows the configuration option for defining an intset’s maximum size.

    Listing 9.3 Configuring the maximum size of the intset encoding for SETs
    # Limits for intset use with SETs
    set-max-intset-entries 512

    As long as we keep our SETs of integers smaller than our configured size, Redis will
    use the intset representation to reduce data size. The following listing shows what happens
    when an intset grows to the point of being too large.

    Listing 9.4 When an intset grows to be too large, it’s represented as a hash table.
    >>> # Let’s add 500 items to the set and see that it’s still encoded as an intset.
    >>> conn.sadd('set-object', *range(500))
    500
    >>> conn.debug_object('set-object')
    {'encoding': 'intset', 'refcount': 1, 'lru_seconds_idle': 0,
    'lru': 283116, 'at': '0xb6d1a1c0', 'serializedlength': 1010,
    'type': 'Value'}
    >>> # When we push it over our configured 512-item limit, the intset is
    >>> # translated into a hash table representation.
    >>> conn.sadd('set-object', *range(500, 1000))
    500
    >>> conn.debug_object('set-object')
    {'encoding': 'hashtable', 'refcount': 1, 'lru_seconds_idle': 0,
    'lru': 283118, 'at': '0xb6d1a1c0', 'serializedlength': 2874,
    'type': 'Value'}
    
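To see where the switch happens on a particular server, one option is to add members one at a time and ask Redis for the encoding after each insertion. The helper below is a minimal sketch, not part of the book's code: the function name and the `max_probe` parameter are hypothetical, and it assumes `conn` is a redis-py style client that offers `delete`, `sadd`, and `object('encoding', key)`. It deletes and rewrites the key it is given, so it should only be pointed at a scratch key.

```python
def first_size_leaving_intset(conn, key, max_probe=2000):
    """Add 0, 1, 2, ... to a fresh SET and return the size at which the
    reported encoding first stops being 'intset' (None if it never does)."""
    conn.delete(key)                      # start from an empty key
    for n in range(max_probe):
        conn.sadd(key, n)
        encoding = conn.object('encoding', key)
        if isinstance(encoding, bytes):   # clients may return bytes or str
            encoding = encoding.decode()
        if encoding != 'intset':
            return n + 1                  # the SET now holds n + 1 members
    return None
```

Against a live server this might be called as `first_size_leaving_intset(conn, 'probe-set')`, where `'probe-set'` is a throwaway key name. The result depends on the server's `set-max-intset-entries` setting.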

    Earlier, in the introduction to section 9.1, I mentioned that to read or update part of
    an object that uses the compact ziplist representation, we may need to decode the
    entire ziplist, and may need to move in-memory data around. For these reasons,
    reading and writing large ziplist-encoded structures can reduce performance. Intset-encoded
    SETs have similar issues, not so much because of encoding and decoding
    the data, but because we need to move data around when performing insertions
    and deletions. Next, we’ll examine some performance issues when operating
    with long ziplists.
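The data movement involved in intset insertions can be illustrated with a toy sorted array in plain Python. This is a sketch of the general technique only, not Redis's actual intset implementation; the class name and the "shifted" counter are hypothetical and exist purely to make the cost visible. Lookups use binary search, while an insertion has to shift every element that sorts after the new value.

```python
from bisect import bisect_left


class SortedIntArray:
    """Toy model of a SET stored as a sorted array of integers."""

    def __init__(self):
        self.items = []

    def contains(self, value):
        # Binary search: O(log n) comparisons, no data movement.
        i = bisect_left(self.items, value)
        return i < len(self.items) and self.items[i] == value

    def add(self, value):
        """Insert `value` if absent.

        Returns (added, shifted), where `shifted` is the number of existing
        elements that had to move one slot to make room.
        """
        i = bisect_left(self.items, value)
        if i < len(self.items) and self.items[i] == value:
            return (False, 0)             # already present, nothing moves
        shifted = len(self.items) - i     # everything from position i onward moves
        self.items.insert(i, value)
        return (True, shifted)
```

Adding the values 1 through 1000 in increasing order shifts nothing, because each one lands at the end. Adding 0 afterward returns `(True, 1000)`, since every existing element has to move. The longer the array, the more an unlucky insertion or deletion costs.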