Documentation - Redise Pack

A guide to Redise Pack installation, operation and administration

Database Persistence with Redise Pack

Because Redise Pack is not just a caching solution but also a full-fledged database, it supports persisting your data to disk on a per-database basis and in multiple ways. Redise Pack has two options for persistence: Append Only File (AOF) and Snapshot (RDB).

Data persistence, via AOF or snapshots, is used solely to restore the database if it fails. This is necessary as Redis is an in-memory datastore and when the process stops or the server crashes, everything in RAM is lost. Data persistence is optional and can be set to none if you so desire.

AOF appends the latest ‘write’ commands to a file, which is fsynced to disk either every second or on every write. As a comparison, AOF resembles a traditional RDBMS’s redo log, if you are familiar with those. This file can be ‘replayed’ in order to recover from a crash.
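To make the idea of replaying a log concrete, here is a minimal toy sketch in Python. It is not Redise Pack's actual file format or implementation: the tab-separated log layout, the file name, and the helper names (`apply`, `append_command`, `replay`) are assumptions made purely for illustration.

```python
import os
import tempfile

def apply(store, command):
    """Apply one write command (a tuple of strings) to an in-memory dict."""
    op, key, *rest = command
    if op == "SET":
        store[key] = rest[0]
    elif op == "DEL":
        store.pop(key, None)

def append_command(log_path, command):
    """Append a write command to the log and force it to disk."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write("\t".join(command) + "\n")
        log.flush()
        os.fsync(log.fileno())  # hypothetical stand-in for an fsync on every write

def replay(log_path):
    """Rebuild the dataset by re-running every logged command in order."""
    store = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            apply(store, tuple(line.rstrip("\n").split("\t")))
    return store

# Toy log file in a temporary directory (illustrative path only).
log_path = os.path.join(tempfile.mkdtemp(), "toy.aof")
for cmd in [("SET", "a", "1"), ("SET", "b", "2"), ("DEL", "a")]:
    append_command(log_path, cmd)

recovered = replay(log_path)  # simulates a restart after a crash
print(recovered)
```

Because every write is logged, the recovered state reflects all three commands, including the delete. The cost is that recovery time grows with the length of the log.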

A snapshot (RDB), on the other hand, is performed every one, six, or twelve hours. The snapshot is a dump of the data, and while even hourly snapshots carry a potential of losing up to one hour of data, it is dramatically faster to recover from a snapshot than from an AOF.
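The snapshot trade-off can be illustrated with a similar toy sketch. Again, this is not how Redise Pack writes RDB files: the JSON dump, the file name, and the helpers `take_snapshot` and `load_snapshot` are hypothetical and only show why writes made after the last snapshot are lost.

```python
import json
import os
import tempfile

def take_snapshot(store, path):
    """Dump the whole dataset to disk, replacing the previous snapshot."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(store, f)
        f.flush()
        os.fsync(f.fileno())
    # Swap the finished dump into place so a reader never sees a partial file.
    os.replace(tmp_path, path)

def load_snapshot(path):
    """Recover the dataset exactly as it was when the snapshot was taken."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Toy snapshot file in a temporary directory (illustrative path only).
snapshot_path = os.path.join(tempfile.mkdtemp(), "toy-snapshot.json")

store = {"a": "1", "b": "2"}
take_snapshot(store, snapshot_path)
store["c"] = "3"  # written after the snapshot, so it is never persisted

restored = load_snapshot(snapshot_path)  # simulates recovery after a crash
print(restored)
```

Recovery is a single load of one file rather than a replay of every command, which is why it is faster, but the key written after the snapshot is missing from the restored data.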

Persistence can be configured either at time of database creation or by editing an existing database’s configuration. While the persistence model can be changed dynamically, just know that it can take time for your database to switch from one persistence model to the other. It will depend on what you are switching from and to, but also the size of your database.

Note: For performance reasons, if you are going to be using AOF, it is highly recommended that you also enable replication for that database. When both features are enabled, persistence is performed on the database slave and does not reduce the performance of the master.

Options for Configuring Data Persistence

There are six options for persistence in Redise Pack:

Options                                  Description
None                                     Data is not persisted to disk at all.
Append Only File (AOF) on every write    Data is fsynced to disk with every write.
Append Only File (AOF) one second        Data is fsynced to disk every second.
Snapshot every 1 hour                    A snapshot of the database is created every hour.
Snapshot every 6 hours                   A snapshot of the database is created every 6 hours.
Snapshot every 12 hours                  A snapshot of the database is created every 12 hours.

The first thing you need to do is determine whether you need persistence at all. Persistence is used to recover from a catastrophic failure, so make sure its overhead is justified before you select it. If the database is being used as a cache, then you may not need persistence. If you do need persistence, then you need to identify which type is best for your use case.
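As a rough aid to that decision, the sketch below maps each option to a worst-case window of lost writes and picks the least aggressive option that fits a given tolerance. The figures are assumptions derived from the fsync and snapshot intervals (with "on every write" approximated as zero), and `suggest_persistence` is a hypothetical helper, not part of Redise Pack.

```python
# Assumed worst-case window of lost writes per option, in seconds.
# AOF values follow the fsync interval; snapshot values follow the snapshot interval.
MAX_LOSS_SECONDS = {
    "Append Only File (AOF) on every write": 0,
    "Append Only File (AOF) one second": 1,
    "Snapshot every 1 hour": 1 * 3600,
    "Snapshot every 6 hours": 6 * 3600,
    "Snapshot every 12 hours": 12 * 3600,
}

def suggest_persistence(cache_only, tolerated_loss_seconds=0):
    """Suggest a persistence option for a database.

    cache_only: True if the data can be rebuilt from another source.
    tolerated_loss_seconds: how many seconds of recent writes may be lost.
    """
    if cache_only:
        return "None"
    budget = max(0, tolerated_loss_seconds)
    # Keep only the options whose worst-case loss fits the budget, then take
    # the one with the largest window, since it is the least resource intensive.
    candidates = [
        (loss, name) for name, loss in MAX_LOSS_SECONDS.items() if loss <= budget
    ]
    return max(candidates)[1]

print(suggest_persistence(cache_only=True))
print(suggest_persistence(cache_only=False, tolerated_loss_seconds=30))
print(suggest_persistence(cache_only=False, tolerated_loss_seconds=7 * 3600))
```

This only weighs durability against overhead. Recovery time and disk space, which also differ between AOF and snapshots, still need to be considered separately.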

Append Only File (AOF) vs Snapshot (RDB)

Now that you know the available options, here is a comparison of the two approaches to help you decide which is right for your use case:

AOF (Append Only File)                                                        RDB (Snapshot)
More resource intensive                                                       Less resource intensive
Provides better durability (recover latest point in time)                     Less durable
Slower time to recover (larger files)                                         Faster recovery time
More disk space required (files tend to grow large and require compaction)    Requires fewer resources (I/O once every several hours and no compaction required)

Data Persistence and Redise Flash

If you are enabling data persistence for databases running on Redise Flash, by default both master and slave shards will be configured to write to disk. This is unlike a standard Redise Pack database, where only the slave shards persist to disk. This dual master and slave data persistence with replication is done to better protect the database against node failures. Flash-based databases are expected to hold larger datasets, so shard repair times can be longer after a node failure. Dual persistence provides better protection against failures during these longer repair times.

However, dual data persistence with replication adds some processor and network overhead, especially in cloud configurations where persistent storage is network attached (e.g. EBS-backed volumes in AWS).

There may be times when performance is critical for your use case and you don’t want to risk data persistence adding latency. If that is the case, you can disable data persistence on the master shards using the following rladmin command (the ID of your database goes directly after db:):

$ rladmin tune db db: master_persistence disabled
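If you need to apply this setting from a script, a minimal sketch such as the following can assemble the command. It assumes the numeric database ID is written directly after `db:`. The helper name `master_persistence_command` and the ID `1` are hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex

def master_persistence_command(db_id):
    """Build the rladmin argument list that disables master persistence.

    Assumes the database is addressed as db:<id>, where <id> is a positive
    integer. The ID passed below is a made-up example value.
    """
    if not isinstance(db_id, int) or db_id < 1:
        raise ValueError("db_id must be a positive integer")
    return ["rladmin", "tune", "db", f"db:{db_id}", "master_persistence", "disabled"]

argv = master_persistence_command(1)
print(shlex.join(argv))
# To execute it on a cluster node, the list could be passed to
# subprocess.run(argv, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems if the list is later handed to `subprocess.run`.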