EBOOK – REDIS IN ACTION

This book covers the use of Redis, an in-memory database/data structure server.

open all | close all

9.1.1 The ziplist representation

To understand why ziplists may be more efficient, we only need to look at the simplest
of our structures, the LIST. In a typical doubly linked list, we have structures called
nodes, which represent each value in the list. Each of these nodes has pointers to the
previous and next nodes in the list, as well as a pointer to the string in the node. Each
string value is actually stored as three parts: an integer representing the length, an
integer representing the number of remaining free bytes, and the string itself followed
by a null character. An example of this in figure 9.1 shows the three string values
"one", "two", and "ten" as part of a larger linked list.

Ignoring a few details (which only make linked lists look worse), each of these
three strings that are each three characters long will actually take up space for three
pointers, two integers (the length and remaining bytes in the value), plus the string
and an extra byte. On a 32-bit platform, that’s 21 bytes of overhead to store 3 actual
bytes of data (remember, this is an underestimate of what’s actually stored).

Figure 9.1How long LISTs are stored in Redis

On the other hand, the ziplist representation will store a sequence of length, length,
string elements. The first length is the size of the previous entry (for easy scanning in
both directions), the second length is the size of the current entry, and the string is the
stored data itself. There are some other details about what these lengths really mean in
practice, but for these three example strings, the lengths will be 1 byte long, for 2 bytes
of overhead per entry in this example. By not storing additional pointers and metadata,
the ziplist can cut down overhead from 21 bytes each to roughly 2 bytes (in this example).

Let’s see how we can ensure that we’re using the compact ziplist encoding.

USING THE ZIPLIST ENCODING

In order to ensure that these structures are only used when necessary to reduce memory,
Redis includes six configuration options, shown in the following listing, for determining
when the ziplist representation will be used for LISTs, HASHes, and ZSETs.

Listing 9.1Configuration options for the ziplist representation of different structures
list-max-ziplist-entries 512
list-max-ziplist-value 64

Limits for ziplist use with LISTs.

hash-max-ziplist-entries 512
hash-max-ziplist-value 64

Limits for ziplist use with HASHes (previous versions of Redis used a different name and encoding for this).

zset-max-ziplist-entries 128
zset-max-ziplist-value 64

Limits for ziplist use with ZSETs.

The basic configuration options for LISTs, HASHes, and ZSETs are all similar, composed
of -max-ziplist-entries settings and -max-ziplist-value settings. Their
semantics are essentially identical in all three cases. The entries settings tell us the
maximum number of items that are allowed in the LIST, HASH, or ZSET for them to be
encoded as a ziplist. The value settings tell us how large in bytes each individual entry
can be. If either of these limits are exceeded, Redis will convert the LIST, HASH, or
ZSET into the nonziplist structure (thus increasing memory).

If we’re using an installation of Redis 2.6 with the default configuration, Redis
should have default settings that are the same as what was provided in listing 9.1. Let’s
play around with ziplist representations of a simple LIST object by adding some items
and checking its representation, as shown in the next listing.

Listing 9.2How to determine whether a structure is stored as a ziplist
>>> conn.rpush('test', 'a', 'b', 'c', 'd')
4

Let’s start by pushing four items onto a LIST.

>>> conn.debug_object('test')

We can discover information about a particular object with the “debug object” command.

{'encoding': 'ziplist', 'refcount': 1, 'lru_seconds_idle': 20,
'lru': 274841, 'at': '0xb6c9f120', 'serializedlength': 24,
'type': 'Value'}

The information we’re looking for is the “encoding” information, which tells us that this is a ziplist, which is using 24 bytes of memory.

>>> conn.rpush('test', 'e', 'f', 'g', 'h')
8

Let’s push four more items onto the LIST.

>>> conn.debug_object('test')
{'encoding': 'ziplist', 'refcount': 1, 'lru_seconds_idle': 0,
'lru': 274846, 'at': '0xb6c9f120', 'serializedlength': 36,

We still have a ziplist, and its size grew to 36 bytes (which is exactly 2 bytes overhead, 1 byte data, for each of the 4 items we just pushed).

'type': 'Value'}
>>> conn.rpush('test', 65*'a')
9
>>> conn.debug_object('test')
{'encoding': 'linkedlist', 'refcount': 1, 'lru_seconds_idle': 10,

When we push an item bigger than what was allowed for the encoding, the LIST gets converted from the ziplist encoding to a standard linked list.

'lru': 274851, 'at': '0xb6c9f120', 'serializedlength': 30,

While the serialized length went down, for nonziplist encodings (except for the special encoding for SETs), this number doesn’t represent the amount of actual memory used by the structure.

'type': 'Value'}
>>> conn.rpop('test')
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
>>> conn.debug_object('test')
{'encoding': 'linkedlist', 'refcount': 1, 'lru_seconds_idle': 0,

After a ziplist is converted to a regular structure, it doesn’t get re-encoded as a ziplist if the structure later meets the criteria.

'lru': 274853, 'at': '0xb6c9f120', 'serializedlength': 17,
'type': 'Value'}

With the new DEBUG OBJECT command at our disposal, discovering whether an object
is stored as a ziplist can be helpful to reduce memory use.

You’ll notice that one structure is obviously missing from the special ziplist encoding,
the SET. SETs also have a compact representation, but different semantics and limits,
which we’ll cover next.