Database Persistence with Redise Pack
Because Redise Pack is not just a caching solution but also a full-fledged database, it supports persisting your data to disk on a per-database basis and in multiple ways. Redise Pack has two options for persistence: Append Only File (AOF) and Snapshot (RDB).
Data persistence, via AOF or snapshots, is used solely to restore the database if it fails. This is necessary as Redis is an in-memory datastore and when the process stops or the server crashes, everything in RAM is lost. Data persistence is optional and can be set to none if you so desire.
AOF appends the latest write commands to a file, either every second or on every write, depending on the option you select. AOF resembles the redo log of a traditional RDBMS, if you are familiar with those. This file can be replayed in order to recover from a crash.
A snapshot (RDB), on the other hand, is performed every one, six, or twelve hours. The snapshot is a dump of the data, and while there is a potential of losing up to one hour of data with the hourly setting, it is dramatically faster to recover from a snapshot than from an AOF.
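To make the difference between the two recovery paths concrete, here is a minimal Python sketch of a toy key-value store. It is not how Redise Pack or Redis is implemented; the `ToyStore` class and its methods are hypothetical names used only for illustration. It shows that replaying a log of write commands recovers the latest state, while restoring a snapshot recovers only the state at the time the snapshot was taken.

```python
import json


class ToyStore:
    """Toy in-memory key-value store (illustrative only, not Redis internals)."""

    def __init__(self):
        self.data = {}
        self.aof = []         # stands in for the append-only file
        self.snapshot = None  # stands in for the RDB dump

    def set(self, key, value):
        # Apply the write in memory and append the command to the log.
        self.data[key] = value
        self.aof.append(json.dumps(["SET", key, value]))

    def take_snapshot(self):
        # Dump the whole dataset at this point in time.
        self.snapshot = json.dumps(self.data)

    @staticmethod
    def recover_from_aof(aof):
        # Replay every logged write command in order.
        data = {}
        for line in aof:
            op, key, value = json.loads(line)
            if op == "SET":
                data[key] = value
        return data

    @staticmethod
    def recover_from_snapshot(snapshot):
        # Load the dump in one step; writes made after it are gone.
        return json.loads(snapshot) if snapshot is not None else {}


store = ToyStore()
store.set("a", 1)
store.take_snapshot()
store.set("b", 2)  # written after the snapshot was taken

aof_state = ToyStore.recover_from_aof(store.aof)
rdb_state = ToyStore.recover_from_snapshot(store.snapshot)
print(aof_state)  # {'a': 1, 'b': 2}
print(rdb_state)  # {'a': 1}
```

In this toy model the snapshot restore is a single load, while the log replay has to walk every command, which mirrors the recovery-time trade-off described above.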
Persistence can be configured either when the database is created or by editing an existing database’s configuration. Although the persistence model can be changed dynamically, it can take time for your database to switch from one persistence model to the other. How long it takes depends on what you are switching from and to, and on the size of your database.
Note: For performance reasons, if you are going to use AOF, it is highly recommended that you also enable replication for that database. When both features are enabled, persistence is performed on the database slave and does not take performance away from the master.
Options for Configuring Data Persistence
There are six options for persistence in Redise Pack:
|None||Data is not persisted to disk at all.|
|Append Only File (AOF) on every write||Data is fsynced to disk with every write.|
|Append Only File (AOF) one second||Data is fsynced to disk every second.|
|Snapshot every 1 hour||A snapshot of the database is created every hour.|
|Snapshot every 6 hours||A snapshot of the database is created every 6 hours.|
|Snapshot every 12 hours||A snapshot of the database is created every 12 hours.|
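As a rough way to reason about the options in the table above, the following Python sketch maps each one to an approximate worst-case window of lost writes. The option labels and the `options_within` helper are hypothetical names, not Redise Pack settings, and the windows are assumptions that simply follow the intervals in the table: "every write" is modeled as zero, and each snapshot option as its full interval.

```python
# Approximate worst-case window of lost writes, in seconds, per option.
# Labels are illustrative only; values are assumed from the table's intervals.
LOSS_WINDOW_SECONDS = {
    "none": None,              # nothing is persisted, so nothing can be restored
    "aof-every-write": 0,      # fsync on every write
    "aof-every-second": 1,     # fsync every second
    "snapshot-1h": 1 * 3600,   # snapshot every hour
    "snapshot-6h": 6 * 3600,   # snapshot every 6 hours
    "snapshot-12h": 12 * 3600, # snapshot every 12 hours
}


def options_within(max_loss_seconds):
    """Return the options whose assumed loss window fits the given tolerance."""
    return [
        name
        for name, window in LOSS_WINDOW_SECONDS.items()
        if window is not None and window <= max_loss_seconds
    ]


# Example: a use case that can tolerate losing up to one hour of writes.
print(options_within(3600))
# ['aof-every-write', 'aof-every-second', 'snapshot-1h']
```

A tolerance of zero leaves only the every-write AOF option in this model, which is also the most resource-intensive choice.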
The first thing you need to do is determine whether you need persistence at all. Persistence is used to recover from a catastrophic failure, so make sure its overhead is justified before you select it. If the database is being used as a cache, then you may not need persistence. If you do need persistence, then you need to identify which type is best for your use case.
Append Only File (AOF) vs Snapshot (RDB)
Now that you know the available options, the following table compares the two approaches to help you decide which is right for your use case:
|AOF (Append Only File)||RDB (Snapshot)|
|More resource intensive||Less resource intensive|
|Provides better durability (recover latest point in time)||Less durable|
|Slower time to recover (Larger files)||Faster recovery time|
|More disk space required (files tend to grow large and require compaction)||Requires fewer resources (I/O once every several hours and no compaction required)|
Data Persistence and Redise Flash
If you are enabling data persistence for databases running on Redise Flash, by default each master and slave shard will be configured to write to disk. This is unlike a standard Redise Pack database, where only the slave shards persist to disk. This master and slave dual replication is done for two reasons:
- Typical Redise Flash enabled databases are larger in size as compared to standard Redise Pack databases.
- Replication can, under certain circumstances, take longer when writing to flash storage.
This dual replication adds processing overhead and network overhead, especially in a cloud configuration where the persistent storage is network attached (e.g., EBS-backed volumes in AWS).
There may be times when performance is critical for your use case and matters more to you than having the master shards persist to disk. If that is the case, you can disable data persistence on the master shards with the following command:
$ rladmin tune db db: master_persistence disabled