4.6 Performance considerations
When coming from a relational database background, most users will be so happy with improving performance by a factor of 100 times or more by adding Redis, they won’t realize that they can make Redis perform even better. In the previous section, we introduced non-transactional pipelines as a way to minimize the number of round trips between our application and Redis. But what if we’ve already built an application, and we know that it could perform better? How do we find ways to improve performance?
Improving performance in Redis requires having an understanding of what to expect in terms of performance for the types of commands that we’re sending to Redis. To get a better idea of what to expect from Redis, we’ll quickly run a benchmark that’s included with Redis, redis-benchmark, as can be seen in listing 4.10. Feel free to explore redis-benchmark on your own to discover the performance characteristics of your server and of Redis.
The output of redis-benchmark shows a group of commands that are typically used in Redis, as well as the number of commands of that type that can be run in a single second. A standard run of this benchmark without any options will try to push Redis to its limit using 50 clients, but it’s a lot easier to compare performance of a single benchmark client against one copy of our own client, rather than many.
When looking at the output of redis-benchmark, we must be careful not to try to directly compare its output with how quickly our application performs. This is because redis-benchmark doesn’t actually process the result of the commands that it performs, which means that the results of some responses that require substantial parsing overhead aren’t taken into account. Generally, compared to redis-benchmark running with a single client, we can expect the Python Redis client to perform at roughly 50–60% of what redis-benchmark will tell us for a single client and for nonpipelined commands, depending on the complexity of the command to call.
If you find that your commands are running at about half of what you’d expect given redis-benchmark (about 25–30% of what redis-benchmark reports), or if you get errors reporting “Cannot assign requested address,” you may be accidentally creating a new connection for every command.
I’ve listed some performance numbers relative to a single redis-benchmark client using the Python client, and have described some of the most likely causes of slowdowns and/or errors in table 4.5.
This list of possible performance issues and solutions is short, but these issues amount to easily 95% of the performance-related problems that users report on a regular basis (aside from using Redis data structures incorrectly). If we’re experiencing slowdowns that we’re having difficulty in diagnosing, and we know it isn’t one of the problems listed in table 4.5, we should request help by one of the ways described in section 1.4.
|Performance or error||Likely cause||Remedy|
|50–60% of redis-benchmark for a single client||Expected performance without pipelining||N/A|
|25–30% of redis-benchmark for a single client||Connecting for every command/group of commands||Reuse your Redis connections|
|Client error: “Cannot assign requested address”||Connecting for every command/group of commands||Reuse your Redis connections|
Most client libraries that access Redis offer some level of connection pooling built in. For Python, we only need to create a single redis.Redis() for every unique Redis server we need to connect to (we need to create a new connection for each numbered database we’re using). The redis.Redis() object itself will handle creating connections as necessary, reusing existing connections, and discarding timed-out connections. As written, the Python client connection pooling is both thread safe and fork() safe.