If you’re using Redis, you probably already know all about its data persistence mechanisms, namely Append-Only Files (AOF) and RDB files, but if you need a refresher, take a few minutes and read this documentation page and this blog post to get up to speed quickly. Already back? Good. Now, imagine you have multiple RDB files that you want to merge – how would you do that?
RDB files essentially contain a snapshot of your Redis database. They provide a convenient and compact (as they are compressed) way to back up your dataset and schlep it around from one place to another. Redis can be configured to take snapshots automatically (with the save configuration directive), or you can trigger their creation manually with the SAVE/BGSAVE commands. Alternatively, you can create an RDB file from a remote Redis server with the redis-cli tool and its --rdb switch. Redis will also load an RDB file upon startup (from the location set by the dir and dbfilename directives) and will restore the contents of the database from it.
Redis creates a single RDB file per Redis server – all shared, a.k.a. numbered, databases are included in that file since they are only logical namespaces inside the same server (BTW you should really avoid using shared databases – here’s why). There are cases, however, when you want to take several such RDBs and merge their contents into a single file. The classic use case for this is when your Redis database is sharded across multiple Redis servers, each managing a subset of the keyspace.
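For the sharded case, one way to collect the input files is to pull a snapshot from every shard with redis-cli and its --rdb switch. The sketch below only builds the command lines; the shard addresses, the output file names and the helper name snapshot_commands are all hypothetical.

```python
from typing import List, Tuple


def snapshot_commands(shards: List[Tuple[str, int]]) -> List[List[str]]:
    """Build one `redis-cli --rdb` invocation per (host, port) shard."""
    commands = []
    for index, (host, port) in enumerate(shards, start=1):
        outfile = f"infile{index}.rdb"  # hypothetical naming scheme
        commands.append(
            ["redis-cli", "-h", host, "-p", str(port), "--rdb", outfile]
        )
    return commands


if __name__ == "__main__":
    # Example shard addresses are made up for illustration.
    for argv in snapshot_commands([("10.0.0.1", 6379), ("10.0.0.2", 6379)]):
        print(" ".join(argv))
```

Each returned list can be handed to subprocess.run(argv, check=True) to actually fetch the snapshot.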
One possible approach to merging Redis databases, using RDB files, is to:
- Export each of the databases’ contents to a text file – You can do that using a custom script or the incredibly useful redis-rdb-tools.
- Process the resulting text files and concatenate them to a single text file.
- Load the single text file back into Redis (again using a script or similar).
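The three steps above can be sketched roughly as follows. This is a simplified illustration, not a complete tool: it assumes each database has already been exported to a plain mapping of string keys to string values (other data types and expiry times are ignored), and it emits the merged result as Redis protocol text. The function names are hypothetical.

```python
from typing import Dict, Iterable


def encode_command(*args: str) -> bytes:
    """Encode one command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def merge_exports(exports: Iterable[Dict[str, str]]) -> bytes:
    """Concatenate several key/value exports into one stream of SET commands.

    Assumption: every value is a plain string. If a key appears in more
    than one export, the later SET simply overwrites the earlier one.
    """
    out = []
    for export in exports:
        for key, value in export.items():
            out.append(encode_command("SET", key, value))
    return b"".join(out)


if __name__ == "__main__":
    shard1 = {"user:1": "alice"}
    shard2 = {"user:2": "bob"}
    with open("merged.txt", "wb") as fh:
        fh.write(merge_exports([shard1, shard2]))
```

The resulting file can then be loaded with something like `cat merged.txt | redis-cli --pipe`.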
While this approach works just fine, it is potentially resource-intensive, time-consuming and cumbersome. What if you could use your existing RDB files and skip the export stage? What if you could have the merged result as an RDB file and have Redis load it efficiently on startup? Well, with rdb-merger you can.
rdb-merger is a small utility that I’ve developed to do just that – it accepts multiple RDB files as its input and merges them into a single output file. Using it is as simple as running the following command:
rdb-merger -o outfile.rdb infile1.rdb infile2.rdb
In the example above, rdb-merger will create an output file called outfile.rdb from two input files: infile1.rdb and infile2.rdb. The full usage for rdb-merger can be obtained by running it with the -h flag:
rdb-merger [-h] [-c redis_conf.conf] [-p] -o output_file.rdb input_file.rdb [input_file.rdb …]
The -o switch is mandatory and specifies the output file. Pass a single dash ('-') in place of the file name to direct the output to stdout. Use the -c redis_conf.conf switch to specify the name of a Redis configuration file that will be used during the merge (it can include, for example, different *-ziplist-* or rdbcompression settings to perform a conversion of the input files). The -p switch instructs rdb-merger to print the progress percentage to stderr during its execution.
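If you drive rdb-merger from a script, a small wrapper can assemble the arguments in the order shown in the usage line. This is only a sketch: the helper name build_merge_command and its parameters are hypothetical, and it assumes the rdb-merger binary is on your PATH.

```python
from typing import List, Optional


def build_merge_command(
    output: str,
    inputs: List[str],
    conf: Optional[str] = None,
    progress: bool = False,
) -> List[str]:
    """Assemble an rdb-merger command line from the documented switches."""
    if not inputs:
        raise ValueError("at least one input RDB file is required")
    argv = ["rdb-merger"]
    if conf is not None:
        argv += ["-c", conf]   # Redis configuration file used for the merge
    if progress:
        argv.append("-p")      # print progress percentage to stderr
    argv += ["-o", output]     # mandatory output file, or "-" for stdout
    argv += inputs
    return argv


if __name__ == "__main__":
    print(" ".join(build_merge_command(
        "outfile.rdb", ["infile1.rdb", "infile2.rdb"], progress=True)))
```

Pass the returned list to subprocess.run(argv, check=True) to run the merge.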
Important note: rdb-merger does not check the correctness of the RDB file it produces. That means that if two or more of your input RDBs contain the same key names, the result will not be a valid RDB file and Redis will not be able to load it.
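Since the check is left to you, it may be worth looking for overlapping key names before merging. The sketch below assumes you have already obtained the list of key names in each input file by some other means (for example with redis-rdb-tools) and that all keys live in the same numbered database; the helper name find_duplicate_keys is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def find_duplicate_keys(keys_by_file: Dict[str, Iterable[str]]) -> List[str]:
    """Return, sorted, the key names that appear in more than one input file."""
    counts = Counter()
    for keys in keys_by_file.values():
        counts.update(set(keys))  # count each key at most once per file
    return sorted(key for key, n in counts.items() if n > 1)


if __name__ == "__main__":
    duplicates = find_duplicate_keys({
        "infile1.rdb": ["user:1", "user:2"],
        "infile2.rdb": ["user:2", "user:3"],
    })
    if duplicates:
        print("Do not merge, overlapping keys:", duplicates)
```

An empty result means the inputs do not share any key names, which is the precondition described in the note above.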
rdb-merger is currently part of Redis Labs’ Redis fork on GitHub. To use it, you’ll have to download the sources and run make to build Redis. Use it to make the One RDB file to rule them all, and let me know if you encounter any issues.