Goodbye Cache: Redis as a Primary Database

By Kyle Davis

Many applications pair a database with a cache. Each time data is needed from the database, the application has to do a short, choreographed dance (sketched in code after the list below):

  • Ask the cache for the data.
  • If the cache doesn’t have the data, ask the database for it.
  • Take the data returned by the database and populate the cache with it.
  • Serve the reply from the cache.
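
To make the dance concrete, here is a minimal cache-aside sketch using the redis-py client. The fetch_user_from_database stub, the user:{id} key format, and the one-hour TTL are illustrative assumptions rather than anything prescribed by Redis.

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_user_from_database(user_id):
    # Placeholder for the application's real (slow) database query.
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    cache_key = f"user:{user_id}"

    # 1. Ask the cache for the data.
    cached = r.get(cache_key)
    if cached is not None:
        return json.loads(cached)

    # 2. Cache miss: ask the database.
    user = fetch_user_from_database(user_id)

    # 3. Populate the cache (the one-hour TTL is arbitrary).
    r.setex(cache_key, 3600, json.dumps(user))

    # 4. Serve the data; subsequent reads come from the cache.
    return user
```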

Slow databases sow complexity

This dance adds significant complexity: more application logic, more connections between services, and more potential points of failure.

Why is this dance necessary? Simply put, to make up for the database being slow. 

Do you always need to have a slow database, a fast cache, and an application coordinating between the two? Sure, sometimes you do need a big, complicated, slow database to undergird a big, complicated data model. But very often, this kind of database architecture is simply superfluous. And if you can avoid using a separate, slow database, you can have a single data layer connected to your application, which presents far fewer opportunities for slowdowns and failures. To see how that works, let’s take a look at building an application using a single data layer with Redis.

The primary reason to build an application without multiple data layers is to reduce complexity. Operationally, any time you can remove an entire layer from your stack, things are vastly simplified. You no longer have to worry about the database and cache both staying up, and you also don’t need to worry about two sets of upgrades, security protocols, resources, or underlying infrastructures. 

Scaling both a cache and a database is often complicated: each data layer scales differently, hitting its infrastructure limits and optimization needs at different points. Reducing the number of moving parts also reduces latency; even if each individual piece of the architecture is fast, every component adds some latency, either within the component itself or in the connections between components. Going to a single data store eliminates several internal network traversals. Finally, developing against a single data store requires only a single programmatic interface, so developers need to understand the intricacies of one database rather than a database plus a cache. That reduces the mental cost of context switching during development.
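
For contrast, here is a sketch of the same kind of read and write when Redis is the only data layer: one connection, one code path, no miss handling or repopulation. The user:{id} hash layout is again just an illustrative convention, not a requirement.

```python
import redis

# A single connection to the only data layer in the stack.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_user(user_id, name, email):
    # Write straight to the primary store; there is no second system to keep in sync.
    r.hset(f"user:{user_id}", mapping={"name": name, "email": email})

def get_user(user_id):
    # Read from the same store; no cache-miss branch, no repopulation step.
    return r.hgetall(f"user:{user_id}")
```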

Choosing the right database

If software is an exercise in engineering, part of building great software is choosing the right database. Good engineering is not only about selecting things capable of supporting the design, it’s also about picking things that are not overkill (think of this as an application of the Rule of Least Power). Borrowing from civil engineering, why build a bridge using 9,000 tons of concrete when 100 tons will get the job done? 

Many applications have relatively simple data needs, which can easily be supported by Redis’ built-in data structures. Other applications may need a bit more; for them, Redis provides an extensible engine that lets modules add just the capabilities needed, and no more. This approach extends to durability: Redis can be entirely ephemeral, achieve durability through periodic snapshotting, or go all the way to on-write durability with an append-only file (AOF). You can choose the trade-off between performance and durability that fits your use case.
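
As a rough illustration of those built-in structures and durability options, the sketch below keeps a leaderboard in a sorted set and adjusts durability at runtime via CONFIG SET. In practice the appendonly and appendfsync settings would normally live in redis.conf, and the values shown are examples, not recommendations.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# A leaderboard backed by a built-in sorted set: no schema, no extra module.
r.zadd("leaderboard", {"alice": 420, "bob": 310})
top_ten = r.zrevrange("leaderboard", 0, 9, withscores=True)

# Durability is a dial rather than a fixed property. For example:
r.config_set("appendonly", "no")         # rely on periodic snapshots only
r.config_set("appendonly", "yes")        # or enable the append-only file
r.config_set("appendfsync", "everysec")  # fsync the AOF once per second
```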

Redis and the power of a strong community

Training a developer on new technology is extremely expensive. It’s usually preferable to find someone (either inside or outside the organization) who already knows the technology. 

Redis has been voted Most Loved Database three years in a row on Stack Overflow, and more than 2 billion Redis Docker containers have been launched, so it shouldn’t be hard to find Redis expertise. And when Redis developers get stuck, there are literally thousands of resources (books, tutorials, blog posts, and more) to help resolve the issues. There are hundreds of Redis client libraries covering every major programming language and even some obscure ones, and in many languages developers can choose from a variety of libraries to get just the right style and abstraction level. Redis is a database for a wide range of data sizes, from a few megabytes to hundreds of terabytes.

Redis and the power of simplicity

One reason Redis is so widely used is that it’s simple enough to be understood deeply. Redis gives developers building applications the tools and information they need in a very efficient manner. In the Redis documentation, the computational complexity of every command is given a place of prominence, expressed in Big O notation. Other databases often act as black boxes, either because they’re closed source or because their open source code is too massive and complex to be fully understood by anyone not directly involved.

BSD-licensed and relatively compact, Redis is often cited as an example of a clean, well-organized C codebase. If something doesn’t make sense, it’s easy to read the source and understand exactly what the database is doing. Nothing Redis does is magic; it just uses long-established, efficient patterns to implement fundamental data structures.

Scaling: the price of success

When building an application, it is important to think about a particularly good problem to have: What happens when your application is successful? A successful application has to scale in a couple of ways: you have to scale up throughput and scale out data size.

Redis is synonymous with in-memory data storage. Using DRAM (dynamic random-access memory) is the fastest practical way to store data, so raw single-node speed needs little further optimization; the real question is how the system scales. Redis Enterprise is designed to scale near-linearly (roughly 94% linearity) and has demonstrated 200 million operations per second on a small number of nodes. In short, adding more hardware to your cluster gives you additional throughput without hitting plateaus or walls.
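
As an illustration of how that throughput scaling can stay invisible to application code, here is a sketch using redis-py’s open-source Redis Cluster client, which routes each command to the shard that owns the key’s hash slot; Redis Enterprise typically presents a single proxy endpoint instead, and the host and port below are placeholders.

```python
from redis.cluster import RedisCluster

# Connect through any node; the client discovers the rest of the cluster
# and routes each command to the shard owning the key's hash slot.
rc = RedisCluster(host="localhost", port=7000, decode_responses=True)

# Application code is unchanged as shards are added to grow throughput.
rc.set("session:42", "active")
print(rc.get("session:42"))
```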

Scaling out can also be achieved by adding more hardware, but to be more efficient with your infrastructure budget, you can use Redis on Flash, which lets you address dramatically larger datasets at a lower cost with minimal impact on throughput. Redis on Flash automatically extends DRAM storage into SSDs, keeping the hot values in memory while moving the cooler values into on-instance Flash memory. (More on Redis on Flash and tiered memory in a moment.)

Redis Enterprise also enables scaling across the globe with Active-Active deployments. A single dataset can be replicated to many clusters spread across wide geographic areas, with each cluster remaining fully able to accept reads and writes. Redis Enterprise uses conflict-free replicated data types (CRDTs) to resolve any conflicts automatically at the database level, without data loss. Spreading clusters widely keeps data available at geo-local latency and adds the resiliency to cope with even catastrophic infrastructure failures.

The future is byte-addressable

Having a future-proof database helps ensure that it won’t rapidly become a legacy hindrance to your applications. 

Redis is built around using DRAM as byte-addressable storage. Spinning-disk hard drives were first introduced in 1956 and conceptually remain the same to this day. Many databases assume they are running on spinning disks and optimize accordingly, co-locating data to reduce rotational lag or even using specialized formatting to place indexes on particular parts of the platter.

SSDs, on the other hand, are mostly built on solid-state NAND technology, but they are accessed through hardware and software interfaces designed for spinning disks, which are neither byte addressable nor random access (if you are truly interested in the underlying technology at the hardware level, Ars Technica has a deep dive). The advent of technologies such as Intel’s Optane DC Persistent Memory, which is byte addressable and random access, further blurs the line between system memory and bulk storage. In five years no one will still be thinking about rotational lag and platter optimizations. The future is byte addressable, and Redis is built from the ground up to use byte-addressable storage.

Database performance is forever

Expectations for performance are not going away. You’ll never hear a business leader say, “I wish our database was slower.” Building modern applications means making them real-time, easy to develop, operationally elegant, scalable, and future-proof.

Sure, Redis makes a great database cache, but expanding Redis’ role from caching to primary database gives developers a head start on building the applications of tomorrow.

For more information on how Redis Labs customers are already using Redis as a primary database, check out these case studies: