Documentation - Redise Pack

A guide to Redise Pack installation, operation and administration

open all | close all

9.1.1 The ziplist representation

To understand why ziplists may be more efficient, we only need to look at the simplest of our structures, the LIST. In a typical doubly linked list, we have structures called nodes, which represent each value in the list. Each of these nodes has pointers to the previous and next nodes in the list, as well as a pointer to the string in the node. Each string value is actually stored as three parts: an integer representing the length, an integer representing the number of remaining free bytes, and the string itself followed by a null character. An example of this in figure 9.1 shows the three string values “one”, “two”, and “ten” as part of a larger linked list.

Ignoring a few details (which only make linked lists look worse), each of these three strings that are each three characters long will actually take up space for three pointers, two integers (the length and remaining bytes in the value), plus the string and an extra byte. On a 32-bit platform, that’s 21 bytes of overhead to store 3 actual bytes of data (remember, this is an underestimate of what’s actually stored).

Figure 9.1How long LISTs are stored in Redis

On the other hand, the ziplist representation will store a sequence of length, length, string elements. The first length is the size of the previous entry (for easy scanning in both directions), the second length is the size of the current entry, and the string is the stored data itself. There are some other details about what these lengths really mean in practice, but for these three example strings, the lengths will be 1 byte long, for 2 bytes of overhead per entry in this example. By not storing additional pointers and metadata, the ziplist can cut down overhead from 21 bytes each to roughly 2 bytes (in this example).

Let’s see how we can ensure that we’re using the compact ziplist encoding.

USING THE ZIPLIST ENCODING

In order to ensure that these structures are only used when necessary to reduce memory, Redis includes six configuration options, shown in the following listing, for determining when the ziplist representation will be used for LISTs, HASHes, and ZSETs.

Listing 9.1 Configuration options for the ziplist representation of different structures
list-max-ziplist-entries 512
list-max-ziplist-value 64

Limits for ziplist use with LISTs.

hash-max-ziplist-entries 512
hash-max-ziplist-value 64

Limits for ziplist use with HASHes (previous versions of Redis used a different name and encoding for this).

zset-max-ziplist-entries 128
zset-max-ziplist-value 64

Limits for ziplist use with ZSETs.

The basic configuration options for LISTs, HASHes, and ZSETs are all similar, composed of -max-ziplist-entries settings and -max-ziplist-value settings. Their semantics are essentially identical in all three cases. The entries settings tell us the maximum number of items that are allowed in the LISTHASH, or ZSET for them to be encoded as a ziplist. The value settings tell us how large in bytes each individual entry can be. If either of these limits are exceeded, Redis will convert the LISTHASH, or ZSET into the nonziplist structure (thus increasing memory).

If we’re using an installation of Redis 2.6 with the default configuration, Redis should have default settings that are the same as what was provided in listing 9.1. Let’s play around with ziplist representations of a simple LIST object by adding some items and checking its representation, as shown in the next listing.

Listing 9.2 How to determine whether a structure is stored as a ziplist
>>> conn.rpush('test', 'a', 'b', 'c', 'd')
4

Let’s start by pushing four items onto a LIST.

>>> conn.debug_object('test')

We can discover information about a particular object with the “debug object” command.

{'encoding': 'ziplist', 'refcount': 1, 'lru_seconds_idle': 20,
'lru': 274841, 'at': '0xb6c9f120', 'serializedlength': 24,
'type': 'Value'}

The information we’re looking for is the “encoding” information, which tells us that this is a ziplist, which is using 24 bytes of memory.

>>> conn.rpush('test', 'e', 'f', 'g', 'h')
8

Let’s push four more items onto the LIST.

>>> conn.debug_object('test')
{'encoding': 'ziplist', 'refcount': 1, 'lru_seconds_idle': 0,
'lru': 274846, 'at': '0xb6c9f120', 'serializedlength': 36,

We still have a ziplist, and its size grew to 36 bytes (which is exactly 2 bytes overhead, 1 byte data, for each of the 4 items we just pushed).

'type': 'Value'}
>>> conn.rpush('test', 65*'a')
9
>>> conn.debug_object('test')
{'encoding': 'linkedlist', 'refcount': 1, 'lru_seconds_idle': 10,

When we push an item bigger than what was allowed for the encoding, the LIST gets converted from the ziplist encoding to a standard linked list.

'lru': 274851, 'at': '0xb6c9f120', 'serializedlength': 30,

While the serialized length went down, for nonziplist encodings (except for the special encoding for SETs), this number doesn’t represent the amount of actual memory used by the structure.

'type': 'Value'}
>>> conn.rpop('test')
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
>>> conn.debug_object('test')
{'encoding': 'linkedlist', 'refcount': 1, 'lru_seconds_idle': 0,

After a ziplist is converted to a regular structure, it doesn’t get re-encoded as a ziplist if the structure later meets the criteria.

'lru': 274853, 'at': '0xb6c9f120', 'serializedlength': 17,
'type': 'Value'}

With the new DEBUG OBJECT command at our disposal, discovering whether an object is stored as a ziplist can be helpful to reduce memory use.

You’ll notice that one structure is obviously missing from the special ziplist encoding, the SETSETs also have a compact representation, but different semantics and limits, which we’ll cover next.