Understanding how Redis eviction policies work

Redis is one of the most popular cache stores for applications. With the right configuration, Redis can also be used as a persistent data store. In this blog, we go over how these configurations play out in real-world scenarios. We also discuss what can happen if they are not carefully considered when using Redis for our application.

To start with, let's set up our environment for the experiment. We will use docker to run a simple redis server. Apart from that, you'll need to write a few scripts to easily check the status of our cache store. I'll be using Ruby here, but Python works just as well!

Let's start a redis container on docker.

$ docker run -d -p 6379:6379 redis:latest

Redis server

Redis has a maxmemory configuration to define how much memory it's allowed to use from the host system. Let us check what the current configuration is, using redis-cli.

$ redis-cli
127.0.0.1:6379> CONFIG GET maxmemory
1) "maxmemory"
2) "0"

By default, redis sets maxmemory to 0, which means it will not manage or restrict its memory use. It is still constrained by the RAM available on the host, or by an implicit 3GB limit on a 32-bit system. Additionally, redis uses maxmemory-policy to decide how data is deleted when memory has to be reclaimed. Let us check that configuration as well, using redis-cli.

127.0.0.1:6379> CONFIG GET maxmemory-policy
1) "maxmemory-policy"
2) "noeviction"

By default, redis sets maxmemory-policy to noeviction, and will never delete any data on its own. This too is a conscious default that lets people decide how they want to use redis. As mentioned above, if someone wants their redis to persist data, there is no need to enable eviction.

So what can go wrong?

For applications running in a production environment with a large user base, this configuration can cause redis downtime if left unattended. How? We have asked the redis server not to restrict its memory use, and the noeviction policy means the data in redis keeps growing until it reaches a critical level. Redis can no longer add new records, as it is limited by the memory available. The operating system kills redis when it runs out of memory, and our apps begin to crash with connection refused errors from redis. Since redis is an in-memory solution, we have to increase the system's RAM and restart it to get it running again. Even though increasing RAM is technically a possible solution, it is often not financially viable, as RAM gets expensive when we start scaling vertically. In such cases, disk-based data stores are an option worth exploring.
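One way to spot this situation early is to compare the memory redis reports using against its limit. The snippet below is only a sketch: memory_report is a hypothetical helper, and the sample numbers are made up for illustration. It assumes a hash shaped like the one redis-rb returns from $redis.info, where used_memory and maxmemory arrive as strings.

```ruby
# Hypothetical helper: summarizes memory pressure from INFO-style fields.
# `info` is assumed to look like the hash returned by `$redis.info`,
# where every value is a string.
def memory_report(info)
  used  = Integer(info.fetch("used_memory"))
  limit = Integer(info.fetch("maxmemory"))

  if limit.zero?
    # maxmemory 0 means redis itself enforces no limit
    { used_bytes: used, limit_bytes: nil, ratio: nil, unbounded: true }
  else
    { used_bytes: used, limit_bytes: limit, ratio: used.to_f / limit, unbounded: false }
  end
end

# Sample values chosen for illustration, not measured output.
unbounded = memory_report({ "used_memory" => "900000", "maxmemory" => "0" })
bounded   = memory_report({ "used_memory" => "786432", "maxmemory" => "1048576" })

puts unbounded[:unbounded]
puts bounded[:ratio]
```

With a live connection, the same helper could be fed $redis.info directly. An unbounded result is the setup described above, where only the host's RAM stops redis from growing.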

Redis as cache store

Moving on, what if one wishes to use redis as a cache store? What things should we be careful of, to start with?

A cache store, as the name suggests, is a solution for short-lived data that is safe to purge in due course. Cache is used in applications that require high-speed access to a certain set of data on a regular basis. Cache is also used when an application wants to store intermediate metadata of ongoing tasks for some hours or even days. This suggests that we can ask redis to remove unwanted or outdated records from its memory. Two benefits:

Less RAM requirement => less cost barrier
Near-zero downtime => now who wouldn't like that?

How should we configure it?

To understand how one should configure redis, we need to look at the options redis provides. The official documentation lists the configurations we can use. In short, we need to change the maxmemory and maxmemory-policy values to give the redis server full control over managing its share of memory.

Let's run a fun experiment to see how each of these configurations makes a difference. We will look at three common eviction policies in this blog: allkeys-lru, volatile-lru, and volatile-ttl. The experiment can be extended to other policies as well. To start with, we need to set the two configurations.

127.0.0.1:6379> CONFIG SET maxmemory 1mb
OK

To mimic a low-memory scenario, we restrict the allowed memory to just 1mb. At this point, redis can hold only a very limited number of records. Once the memory is full, redis kicks off its probabilistic eviction algorithm to determine which keys can be deleted to make space for new records. As users, we have the luxury of controlling how these keys are selected for eviction. Neat! Let's try these algorithms out!
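Why "probabilistic"? Redis does not track an exact global LRU order. It samples a handful of keys (the maxmemory-samples setting, 5 by default) and evicts the best candidate among them. The toy model below sketches that idea in plain Ruby. pick_victim and the logical timestamps are assumptions made for illustration, and the real implementation differs in its details.

```ruby
# Toy model of sampling-based (approximated) LRU eviction.
# last_access maps key => logical timestamp of the last read or write.
def pick_victim(last_access, samples: 5, rng: Random.new)
  # Look at a small random sample instead of scanning every key
  candidates = last_access.keys.sample(samples, random: rng)
  # Among the sampled keys, evict the one that was touched longest ago
  candidates.min_by { |key| last_access[key] }
end

last_access = {}
(1..20).each { |i| last_access["loop-#{i}"] = i } # loop-1 is the oldest
last_access["loop-1"] = 21                        # touching loop-1 makes it the newest

victim = pick_victim(last_access, rng: Random.new(42))
puts "evicting #{victim}"
```

Because only a sample is inspected, the evicted key is not always the globally oldest one. This is worth keeping in mind when reading the results of the experiments that follow.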

Case: allkeys-lru, just give my damn space back!

That's right, allkeys-lru removes the least recently used key(s) from memory without any special consideration.

127.0.0.1:6379> CONFIG SET maxmemory-policy allkeys-lru
OK

Switch to a ruby console and run some scripts that read and write data to see which keys get deleted. We can use the redis-rb gem to connect to redis.

irb> require "redis"
irb> require "securerandom"
irb> $redis = Redis.new(url: "redis://localhost:6379")
irb> $redis.set("constant-1", "foo")
# "OK"
irb> $redis.set("constant-2", "bar")
# "OK"
irb> $redis.set("constant-3", "tar")
# "OK"
irb> $redis.get("constant-2")
# "bar"
irb> $redis.get("constant-1")
# "foo"
 
# add many records
irb> (1..500).each { |key| $redis.set("loop-#{key}", SecureRandom.uuid)}
 
# get some of the keys to mimic usage
irb> (3..100).each do |key|
        unless $redis.get("loop-#{key}")
          puts "loop-#{key} evicted"
        else
          puts "loop-#{key} found"
        end
      end
# loop-3 found
# .
# .
# loop-28 found
# .
# .
# loop-100 found
 
# now let's flood redis with more data
irb> (501..1500).each { |key| $redis.set("loop-#{key}", SecureRandom.uuid)}
irb> $redis.get("constant-3")
# nil
irb> $redis.get("loop-1")
# nil
irb> $redis.get("loop-12")
# "ea6ef190-05d4-480c-8e47-6785d80ca4d1"
irb> $redis.get("loop-100")
# "fd8ff9e1-6ee2-4e74-aaff-1043b7c24e67"
irb> $redis.get("loop-242")
# nil
irb> $redis.get("loop-1499")
# "88f17049-0e11-4cbc-8485-e000df21a2c3"

So what is going on here?

1. We initially added three keys, constant-1,2,3, and accessed two of them.
2. We added 500 records in a loop, which consumes a considerable amount of memory.
3. We accessed a subset of those keys to mimic recently used keys, i.e. we accessed keys 3 to 100. All of these keys are still available in redis.
4. We began to flood redis with 1000 more records, upon which redis hits the memory limit.
5. Redis picks keys based on the LRU policy and removes them.
6. We randomly check some keys to confirm that the keys we queried in step 3 are still available in redis, while most of the other keys from steps 2 and 4 are already evicted.
7. We can also note that recently inserted keys, e.g. loop-1499, are still retained.

In this case, it's clear that redis only cares about keys that were used recently and removes all other keys without any further consideration.
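Spot-checking keys one by one gets tedious, so a small helper can tally a whole range at once. This is just a sketch: tally_keys and FakeStore are hypothetical names. FakeStore stands in for the connection so the snippet runs without a server, and the helper only assumes that the client answers get(key) with nil for a missing key, as redis-rb does.

```ruby
# Hypothetical helper: splits a range of ids into keys that are still present
# and keys that are gone. `client` only needs to respond to `get(key)`.
def tally_keys(client, ids, prefix: "loop-")
  found, missing = ids.partition { |id| client.get("#{prefix}#{id}") }
  { found: found, missing: missing }
end

# Stand-in for a redis connection so the sketch runs without a server.
class FakeStore
  def initialize(data)
    @data = data
  end

  def get(key)
    @data[key]
  end
end

store  = FakeStore.new({ "loop-3" => "a", "loop-4" => "b", "loop-7" => "c" })
result = tally_keys(store, 1..8)

puts "found:   #{result[:found].inspect}"
puts "missing: #{result[:missing].inspect}"
```

Against the real server this would be called as tally_keys($redis, 1..1500). One caveat: every GET counts as an access, so running the tally under an LRU policy refreshes the keys it finds.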

127.0.0.1:6379> FLUSHDB
OK

Case: volatile-lru, expire or get deleted!

Unlike allkeys-lru, volatile-lru removes least recently used key(s) from the memory with a special consideration: the key must have an expiry time set.

127.0.0.1:6379> CONFIG SET maxmemory-policy volatile-lru
OK

Let's switch to ruby console once more and test this case. The only difference from the last case is that, when we add new data, we will also set an expiry time for each record.

# set some records without any expiry first
irb> $redis.set("constant-1", "foo")
# "OK"
irb> $redis.set("constant-2", "bar")
# "OK"
irb> $redis.set("constant-3", "tar")
# "OK"
irb> $redis.get("constant-1")
# "foo"
 
# add many records
# instead of using multi, we can also pass expiry as $redis.set(key, val, ex: 2345)
# but this is not supported by HSET
irb> (1..500).each do |key|
        $redis.multi do |multi|
          multi.set("loop-#{key}", SecureRandom.uuid)
          multi.expire("loop-#{key}", key*20)
        end
      end
 
# get some of the keys to mimic usage
irb> (3..100).each do |key|
        unless $redis.get("loop-#{key}")
          puts "loop-#{key} evicted"
        else
          puts "loop-#{key} found"
        end
      end
# loop-3 found
# .
# .
# loop-38 found
# .
# .
# loop-100 found
 
# now let's flood redis with more data
irb> (501..1500).each do |key|
        $redis.multi do |multi|
          multi.set("loop-#{key}", SecureRandom.uuid)
          multi.expire("loop-#{key}", key*20)
        end
      end
irb> $redis.get("constant-1")
# "foo"
irb> $redis.get("constant-2")
# "bar"
irb> $redis.get("constant-3")
# "tar"
irb> $redis.get("loop-1")
# nil
irb> $redis.get("loop-100")
# "ef32bffc-6b24-4f0d-b2ba-01ef72e94537"
irb> $redis.get("loop-101")
# nil
irb> $redis.get("loop-1499")
# "55f1c049-3e14-4cbc-8485-e000df21a2c3"
# .
# .
# after 1499*20(=29980) seconds
irb> $redis.get("loop-1499")
# nil

So what is going on here?

1. We initially added three keys, constant-1,2,3, and accessed just one of them.
2. We added 500 records with expiry in a loop, which consumes a considerable amount of memory.
3. We accessed a subset of those keys to mimic recently used keys, i.e. we accessed keys 3 to 100.
4. We began to flood redis with 1000 more records, upon which redis hits the memory limit.
5. Redis picks keys with expiry based on the LRU policy and removes them.
6. We randomly check some keys to confirm that the keys we queried in step 3 are still available in redis, while the other keys from step 2 and most of the keys from step 4 are gone, either because they expired or because they were forcefully removed.
7. We can also note that the keys inserted in step 1, e.g. constant-1,2,3, are still retained. These keys do not have an expiry and hence the algorithm ignores them!

So in this case, things have taken a slight detour, but we are still removing keys based on the LRU technique. Let's see how the third case adds one more condition to this.
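The rule can be condensed into a few lines of plain Ruby. The sketch below is a toy model with exact LRU and no sampling. volatile_lru_victim and the logical timestamps are assumptions for illustration, not how redis stores its data.

```ruby
# Toy model of volatile-lru candidate selection (exact LRU, no sampling).
# Each entry records a logical last-access time and an expiry (nil = no expiry).
def volatile_lru_victim(entries)
  # Only keys that carry an expiry are eligible for eviction
  volatile = entries.reject { |_key, entry| entry[:expires_at].nil? }
  return nil if volatile.empty?

  # Among those, pick the key that was accessed longest ago
  volatile.min_by { |_key, entry| entry[:last_access] }.first
end

entries = {
  "constant-1" => { last_access: 1, expires_at: nil }, # oldest access, but no expiry
  "loop-1"     => { last_access: 5, expires_at: 120 },
  "loop-2"     => { last_access: 3, expires_at: 140 }, # least recently used with an expiry
  "loop-3"     => { last_access: 9, expires_at: 160 }
}

victim = volatile_lru_victim(entries)
puts victim

only_persistent = { "constant-1" => { last_access: 1, expires_at: nil } }
puts volatile_lru_victim(only_persistent).inspect
```

The nil result in the second call matters in practice: when no key has an expiry, a volatile policy has nothing to evict and redis behaves as it does under noeviction.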

127.0.0.1:6379> FLUSHDB
OK

Case: volatile-ttl, squeeze maximum out of life!

We saw how redis removes keys with expiry in the case above. volatile-ttl is almost the same, except that redis removes the keys that are closest to expiring anyway, instead of picking the least recently used record. Note that redis ignores keys without an expiry in this case as well.

127.0.0.1:6379> CONFIG SET maxmemory-policy volatile-ttl
OK

Let's check this out just like before.

# set some records without any expiry first
irb> $redis.set("constant-1", "foo")
# "OK"
irb> $redis.set("constant-2", "bar")
# "OK"
irb> $redis.set("constant-3", "tar")
# "OK"
irb> $redis.get("constant-1")
# "foo"
 
# add many records
irb> (1..500).each do |key|
        $redis.multi do |multi|
          multi.set("loop-#{key}", SecureRandom.uuid)
          multi.expire("loop-#{key}", key*20)
        end
      end
 
# get some of the keys to mimic usage
irb> (3..100).each do |key|
      unless $redis.get("loop-#{key}")
        puts "loop-#{key} evicted"
      else
        puts "loop-#{key} found"
      end
    end
# loop-3 found
# .
# .
# loop-38 found
# .
# .
# loop-100 found
 
# now let's flood redis with more data
irb> (501..1500).each do |key|
        $redis.multi do |multi|
          multi.set("loop-#{key}", SecureRandom.uuid)
          multi.expire("loop-#{key}", key*20)
        end
      end
irb> $redis.get("constant-1")
# "foo"
irb> $redis.get("constant-2")
# "bar"
irb> $redis.get("constant-3")
# "tar"
irb> $redis.get("loop-1")
# nil
irb> $redis.get("loop-3")
# nil
irb> $redis.get("loop-100")
# nil
irb> $redis.get("loop-299")
# "a1ee4e2c-f6ef-4e3c-a8c5-eba059c65169"
irb> $redis.get("loop-1499")
# "a41b205d-8ef1-4b0b-ab2a-ab910d818e2f"
# .
# .
# after 1499*20(=29980) seconds
irb> $redis.get("loop-1499")
# nil

Hmm! Looks like we have some considerable changes here.

1. We initially added three keys, constant-1,2,3, without expiry and accessed just one of them.
2. We added 500 records with expiry in a loop, which consumes a considerable amount of memory.
3. We accessed a subset of those keys to mimic recently used keys, i.e. we accessed keys 3 to 100.
4. We began to flood redis with 1000 more records, upon which redis hits the memory limit.
5. Redis picks keys with expiry and removes them in order of their remaining time to live.
6. We randomly check some keys. The interesting thing here is that the keys are removed in order. The ones we added first have, by now, the least time left to live. So the keys up to 100 (a TTL of at most 2000 seconds) have either expired on their own or, being the closest to expiry, were removed by redis. Also, even though we accessed keys 3..100, redis does not factor that in. There is no question of LRU here. Everyone falls in line to get deleted as long as they have an expiry set!
7. We can also note that the keys inserted in step 1, e.g. constant-1,2,3, are still retained. These keys do not have an expiry and hence the algorithm ignores them!
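To see the ordering on its own, here is a toy model that reuses the key*20 expiry from the experiment. It is a sketch under simplifying assumptions: volatile_ttl_order is a hypothetical helper, it ignores the time that passes between inserts, and it sorts exactly where redis samples.

```ruby
# Toy model of volatile-ttl ordering (exact, no sampling).
# ttls maps key => remaining seconds to live, or nil for keys without expiry.
def volatile_ttl_order(ttls)
  ttls.reject { |_key, ttl| ttl.nil? } # keys without expiry are never candidates
      .sort_by { |_key, ttl| ttl }     # shortest remaining TTL goes first
      .map(&:first)
end

ttls = { "constant-1" => nil }
[100, 3, 299, 1499].each { |id| ttls["loop-#{id}"] = id * 20 }

order = volatile_ttl_order(ttls)
puts order.inspect
```

The order comes out as loop-3, loop-100, loop-299, loop-1499, which mirrors the lookups above: the low-numbered keys are gone while loop-299 and loop-1499 survive, and constant-1 never enters the queue.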

When/Which to use?

The answer to this question boils down to what we are trying to achieve.

Do you have a task that requires some records for X seconds, while others matter less in terms of life span? -> use volatile-ttl with varying TTLs for records
Do you have certain keys that need to be always available in redis, but also require general cleaning up? -> pick volatile-lru
Do you just dump data in redis for a very short period and no longer use it afterwards, or are you unsure? -> stick with the safer allkeys-lru

Apart from these, redis has more eviction algorithms that can come in handy. Do try them in a similar fashion!
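As a quick reference for further experiments, the table below lists the values maxmemory-policy accepts, written as a Ruby hash. The policy names are the ones redis defines. The hash layout, the symbol labels, and the spares_persistent_keys? helper are assumptions made for this sketch.

```ruby
# Candidate set and selection rule for each maxmemory-policy value.
POLICIES = {
  "noeviction"      => { candidates: :none,        picks: nil },
  "allkeys-lru"     => { candidates: :all_keys,    picks: :least_recently_used },
  "allkeys-lfu"     => { candidates: :all_keys,    picks: :least_frequently_used },
  "allkeys-random"  => { candidates: :all_keys,    picks: :random },
  "volatile-lru"    => { candidates: :with_expiry, picks: :least_recently_used },
  "volatile-lfu"    => { candidates: :with_expiry, picks: :least_frequently_used },
  "volatile-random" => { candidates: :with_expiry, picks: :random },
  "volatile-ttl"    => { candidates: :with_expiry, picks: :shortest_ttl }
}.freeze

# True when keys without an expiry are never evicted under the policy.
def spares_persistent_keys?(policy)
  POLICIES.fetch(policy)[:candidates] != :all_keys
end

POLICIES.each_key do |policy|
  puts format("%-16s spares keys without expiry: %s", policy, spares_persistent_keys?(policy))
end
```

Any of these names can be passed to CONFIG SET maxmemory-policy to repeat the experiment under a different policy.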