When confronted with system failures, we have tools to help us recover when either
snapshotting or append-only file logging has been enabled. Redis includes two command-line
applications for testing the status of a snapshot and an append-only file.
These commands are redis-check-aof and redis-check-dump (renamed redis-check-rdb in
later Redis releases). If we run either command without arguments, we’ll see the basic
help that’s provided.
If we provide --fix as an argument to redis-check-aof, the command will fix the
file. Its method to fix an append-only file is simple: it scans through the provided AOF,
looking for an incomplete or incorrect command. Upon finding the first bad command,
it trims the file to just before that command would’ve been executed. For most
situations, this will discard the last partial write command.
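The scan-and-trim idea described above can be sketched in a few lines of Python. This is only an illustration, not the actual redis-check-aof implementation: it assumes the file holds nothing but commands in the Redis protocol encoding (an argument count line starting with `*`, then a `$`-prefixed length and payload for each argument), and the function names are hypothetical.

```python
from typing import Tuple


def _read_line(data: bytes, pos: int) -> Tuple[bytes, int]:
    """Return the CRLF-terminated line starting at pos and the offset after it."""
    end = data.find(b"\r\n", pos)
    if end == -1:
        raise ValueError("line is not terminated")
    return data[pos:end], end + 2


def _read_command(data: bytes, pos: int) -> int:
    """Parse one command starting at pos and return the offset just past it."""
    header, pos = _read_line(data, pos)
    if not header.startswith(b"*"):
        raise ValueError("expected an argument count")
    argc = int(header[1:])  # int() raises ValueError on non-numeric input
    if argc < 1:
        raise ValueError("a command needs at least one argument")
    for _ in range(argc):
        size_line, pos = _read_line(data, pos)
        if not size_line.startswith(b"$"):
            raise ValueError("expected an argument length")
        size = int(size_line[1:])
        end = pos + size
        # The payload must be fully present and followed by CRLF.
        if size < 0 or data[end:end + 2] != b"\r\n":
            raise ValueError("argument is truncated or malformed")
        pos = end + 2
    return pos


def last_good_offset(data: bytes) -> int:
    """Return the offset just past the last command that parses cleanly."""
    good = 0
    while good < len(data):
        try:
            good = _read_command(data, good)
        except ValueError:
            break  # first bad command found; everything after it is dropped
    return good
```

Truncating a copy of the file to `last_good_offset(contents)` bytes would mimic what the fix does in this simplified model. For a real repair, the bundled tool remains the right choice.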
Unfortunately, there’s no currently supported method of repairing a corrupted
snapshot. Though there’s the potential to discover where the first error occurred,
because the snapshot itself is compressed, an error partway through the dump has the
potential to make the remaining parts of the snapshot unreadable. It’s for these reasons
that I’d generally recommend keeping multiple backups of important snapshots,
and calculating the SHA1 or SHA256 hashes to verify content during restoration.
(Modern Linux and Unix platforms provide the sha1sum and sha256sum command-line
applications for generating and verifying these hashes.)
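Where a script is more convenient than the command-line tools, the same check can be done with Python's standard hashlib module. The sketch below is one possible approach; the helper names `sha256_of_file` and `backup_matches` are hypothetical, and it assumes the expected digest was recorded as a hex string when the backup was made.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large snapshots need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(path: str, expected_hex: str) -> bool:
    """Compare a file's current SHA256 against the digest recorded earlier."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A backup job might store the result of `sha256_of_file` next to each snapshot copy and call `backup_matches` before restoring from it.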
CHECKSUMS AND HASHES Redis versions 2.6 and later include a
CRC64 checksum of the snapshot as part of the snapshot. A CRC-family
checksum is useful for discovering errors that are typical in some types of
network transfers or disk corruption. The SHA family of cryptographic hashes
is much better suited for discovering arbitrary errors. To the point, if we calculated
the CRC64 of a file, then flipped any number of bits inside the file, we
could later flip a subset of the last 64 bits of the file to produce the original
checksum. There’s no currently known method for doing the same thing with
SHA1 or SHA256.
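To make the weakness concrete, here is a small demonstration of forcing a checksum by adjusting only the trailing bytes. It is a sketch under one loud assumption: it uses CRC32 from Python's zlib module as a stand-in, because the standard library has no CRC64, so the adjustable tail is 32 bits instead of 64. The function name `forge_tail` is hypothetical. The trick relies on the CRC being an affine function of the tail bits over GF(2), so the needed tail can be solved for with XOR-based elimination.

```python
import hashlib
import zlib


def forge_tail(prefix: bytes, target_crc: int) -> bytes:
    """Find 4 trailing bytes so that crc32(prefix + tail) == target_crc."""
    base = zlib.crc32(prefix + bytes(4))  # checksum with an all-zero tail
    basis = {}  # highest set bit -> (crc difference, tail bits producing it)
    for i in range(32):
        tail = (1 << i).to_bytes(4, "little")
        vec = zlib.crc32(prefix + tail) ^ base  # effect of flipping tail bit i
        mask = 1 << i
        while vec:
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = (vec, mask)
                break
            bvec, bmask = basis[top]
            vec ^= bvec
            mask ^= bmask
    # Express the required checksum change as an XOR of basis vectors.
    diff = target_crc ^ base
    tail_bits = 0
    while diff:
        bvec, bmask = basis[diff.bit_length() - 1]
        diff ^= bvec
        tail_bits ^= bmask
    return tail_bits.to_bytes(4, "little")


if __name__ == "__main__":
    original = b"snapshot payload: balance=100;TAIL"
    tampered = b"snapshot payload: balance=999;"
    forged = tampered + forge_tail(tampered, zlib.crc32(original))
    # The CRC no longer distinguishes the two files, but SHA256 still does.
    print(zlib.crc32(forged) == zlib.crc32(original))
    print(hashlib.sha256(forged).digest() == hashlib.sha256(original).digest())
```

The same reasoning carries over to a 64-bit CRC with a 64-bit tail, which is why a CRC protects against accidental damage but not against a deliberate change.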
After we’ve verified that our backups are what we had saved before, and we’ve corrected
the last write to AOF as necessary, we may need to replace a Redis server.