9.1.1 The ziplist representation
To understand why ziplists may be more efficient, we only need to look at the simplest
of our structures, the LIST. In a typical doubly linked list, we have structures called
nodes, which represent each value in the list. Each of these nodes has pointers to the
previous and next nodes in the list, as well as a pointer to the string in the node. Each
string value is actually stored as three parts: an integer representing the length, an
integer representing the number of remaining free bytes, and the string itself followed
by a null character. An example of this in figure 9.1 shows the three string values
"one", "two", and "ten" as part of a larger linked list.
Ignoring a few details (which only make linked lists look worse), each of these
three strings that are each three characters long will actually take up space for three
pointers, two integers (the length and remaining bytes in the value), plus the string
and an extra byte. On a 32-bit platform, that’s 21 bytes of overhead to store 3 actual
bytes of data (remember, this is an underestimate of what’s actually stored).
On the other hand, the ziplist representation will store a sequence of length, length,
string elements. The first length is the size of the previous entry (for easy scanning in
both directions), the second length is the size of the current entry, and the string is the
stored data itself. There are some other details about what these lengths really mean in
practice, but for these three example strings, the lengths will be 1 byte long, for 2 bytes
of overhead per entry in this example. By not storing additional pointers and metadata,
the ziplist can cut down overhead from 21 bytes each to roughly 2 bytes (in this example).
Let’s see how we can ensure that we’re using the compact ziplist encoding.
USING THE ZIPLIST ENCODING
In order to ensure that these structures are only used when necessary to reduce memory,
Redis includes six configuration options, shown in the following listing, for determining
when the ziplist representation will be used for LISTs, HASHes, and ZSETs.
The basic configuration options for LISTs, HASHes, and ZSETs are all similar, composed
of -max-ziplist-entries settings and -max-ziplist-value settings. Their
semantics are essentially identical in all three cases. The entries settings tell us the
maximum number of items that are allowed in the LIST, HASH, or ZSET for them to be
encoded as a ziplist. The value settings tell us how large in bytes each individual entry
can be. If either of these limits are exceeded, Redis will convert the LIST, HASH, or
ZSET into the nonziplist structure (thus increasing memory).
If we’re using an installation of Redis 2.6 with the default configuration, Redis
should have default settings that are the same as what was provided in listing 9.1. Let’s
play around with ziplist representations of a simple LIST object by adding some items
and checking its representation, as shown in the next listing.
With the new DEBUG OBJECT command at our disposal, discovering whether an object
is stored as a ziplist can be helpful to reduce memory use.
You’ll notice that one structure is obviously missing from the special ziplist encoding,
the SET. SETs also have a compact representation, but different semantics and limits,
which we’ll cover next.