Distributed locking with Redis

Most of us know Redis as an in-memory database, a key-value store in simple terms, with support for a ‘ttl’ (time to live) on each key. Redis is commonly used as a cache. In this article, I am going to show how we can leverage Redis as a locking mechanism, specifically in a distributed system. Let’s get redi(s) then ;)

In today’s world, it is rare to see applications that run on a single instance or a single machine, or that share no resources across application environments. Most developers and teams go with a distributed-system solution to their problems (distributed machines, distributed messaging, distributed databases, etc.). It is very important to have synchronized access to such shared resources in order to avoid corrupt data and race conditions.

Suppose you are working on a web application that serves millions of requests per day. You will probably need multiple instances of your application (and, of course, a load balancer) to serve your customers’ requests efficiently. Suppose some resources need to be shared among these instances; you then need a synchronized way of handling each such resource without any data corruption. So you need a locking mechanism for the shared resource that is “distributed” over these instances, so that all of them work on it in a synchronized manner. Simple locking constructs such as mutexes, semaphores and monitors will not help here, as they are bound to a single system. We need a central locking system with which all the instances can interact. Many developers use standard database locking for this, and so will we, with Redis as the database.

Redis, as stated earlier, is a simple key-value store with fast execution times and a ttl feature, which will be helpful for us later on. We were talking about synchronizing a shared resource among different instances of an application. What we will be doing is:

  • Create a unique key for the resource
  • Acquire a lock on that key using Redis
  • Perform our operations
  • Release the lock for that key

Redis provides a set of commands that cover the usual create, read, update and delete operations. Let us first define what a “client” of Redis is. A client can be any of the following:

  • Application instance
  • Any thread, in the case of a multi-threaded environment (see Java/JVM)
  • Any other manual query/command from terminal

So whenever a client is going to perform some operation on a resource, it needs to acquire a lock on that resource. To acquire the lock we will generate a unique key corresponding to the resource, say resource-UUID-1, and insert it into Redis using the following command:

SETNX key value: set the key to the given value only if the key does not already exist (NX stands for “Not eXists”). It returns 1 if the key was set and 0 if it was not. So the code for acquiring a lock goes like this:

boolean acquireLock(key, value) {
    // SETNX replies 1 if the key was created, 0 if it already existed
    result = execute(SETNX, key, value)
    if (result == 1) {
        log("acquired lock by client %s for key %s", value, key)
        return true
    } else {
        log("couldn't acquire, retry after some time")
        return false
    }
}

This requires a slight modification. What happens if a client acquires a lock and dies without releasing it? Other clients will think that the resource is still locked and they will wait forever.

This can be handled by specifying a ttl for the key. While setting a key in Redis, we provide a ttl that states the lifetime of the key. After the ttl is over, the key expires automatically. So in this case we will just change the command to SET key value EX 10 NX: set the key if it does not exist, with an EXpiry of 10 seconds. This form replies OK when the key was set and nil when it was not. So the resource will be locked for at most 10 seconds.
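The effect of the ttl can be tried out without a running server. The following is only a sketch: `InMemoryRedis` is a hypothetical stand-in written for this example (not part of any library), whose `set` method borrows the `nx` and `ex` keyword names from the redis-py client, and `acquire_lock` is an illustrative name. The clock is driven by hand so that the expiry is visible without sleeping.

```python
import time


class InMemoryRedis:
    """Hypothetical stand-in for a Redis server; models SET with NX/EX only."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _live(self, key):
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= self._clock():
            del self._data[key]  # drop the key once its ttl has passed
            return None
        return entry

    def set(self, key, value, nx=False, ex=None):
        if nx and self._live(key) is not None:
            return None  # key exists: SET ... NX replies nil
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)
        return True


def acquire_lock(client, key, owner, ttl_seconds=10):
    """Try once to take the lock; True on success, False if it is held."""
    return client.set(key, owner, nx=True, ex=ttl_seconds) is True


# Demo with a hand-driven clock instead of real time.
now = [0.0]
redis_like = InMemoryRedis(clock=lambda: now[0])

first = acquire_lock(redis_like, "resource-UUID-1", "client-a")
second = acquire_lock(redis_like, "resource-UUID-1", "client-b")
now[0] += 11  # the 10 second ttl elapses, e.g. client-a has died
third = acquire_lock(redis_like, "resource-UUID-1", "client-b")
print(first, second, third)
```

The second attempt fails while client-a holds the key, and the third succeeds only because the ttl has run out, which is exactly the recovery path for a dead client.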

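Okay, locking looks cool. Because SET with NX is a single atomic command, two clients cannot both set the same key at the same moment. The rare case in which two clients end up in the critical section together is when the lock expires before its holder has finished, so synchronization is not strictly guaranteed. Now, once our operation is performed, we need to release the key if it has not expired.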

We need to free the lock on the key so that other clients can also perform operations on the resource. A key should be released only by the client that acquired it (and only if it has not expired). To ensure this, before deleting a key we fetch it from Redis using the GET key command, which returns the value if present or else nothing. We first check that the value of this key is the current client’s name, and only then go ahead and delete it. Code for releasing a lock on the key:

void releaseLock(key, client) {
    // delete the key only if this client still owns the lock
    value = execute(GET, key)
    if (value != null && client == value) {
        execute(DEL, key)
    } else {
        log("error while releasing lock for key %s", key)
    }
}

This needs to be done because a client may take too long to process the resource, during which the lock in Redis expires and another client acquires the lock on this key. Once the first client has finished processing, it tries to release the lock it acquired earlier. If we didn’t have the value == client check, the lock held by the new client would be released by the old client, allowing yet other clients to lock the resource and process it simultaneously with the second client, causing race conditions or data corruption. Following is a sample code.

String generateKeys() {
    return "specialUUID"
}

key = generateKeys()
value = "client_name" // thread name or app name
if (acquireLock(key, value)) {
    performOperation() // sleep for 2 seconds or network call
    releaseLock(key, value)
}
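The late-release scenario described above can be replayed in a few lines. This is a sketch under assumptions: `InMemoryRedis` is a hypothetical in-memory stand-in (its `set`, `get` and `delete` methods only imitate the shape of the redis-py client), `release_lock` is an illustrative name, and the clock is advanced by hand to simulate a slow client.

```python
import time


class InMemoryRedis:
    """Hypothetical stand-in for Redis: SET with NX/EX, GET and DEL only."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def _live(self, key):
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= self._clock():
            del self._data[key]  # the ttl has passed
            return None
        return entry

    def set(self, key, value, nx=False, ex=None):
        if nx and self._live(key) is not None:
            return None
        self._data[key] = (value, None if ex is None else self._clock() + ex)
        return True

    def get(self, key):
        entry = self._live(key)
        return entry[0] if entry else None

    def delete(self, key):
        if self._live(key) is None:
            return 0
        del self._data[key]
        return 1


def release_lock(client, key, owner):
    """Delete the key only if `owner` still holds it.

    Against a real server, GET followed by DEL is two separate commands,
    so the key could change hands in between; this sketch keeps the
    two-step form used in the article.
    """
    if client.get(key) == owner:
        client.delete(key)
        return True
    return False


now = [0.0]
r = InMemoryRedis(clock=lambda: now[0])
key = "resource-UUID-1"

r.set(key, "client-a", nx=True, ex=10)    # client-a takes the lock
now[0] += 11                               # client-a is slow, the ttl runs out
r.set(key, "client-b", nx=True, ex=10)    # client-b takes the lock over
stale = release_lock(r, key, "client-a")   # late release by client-a is refused
owner_after = r.get(key)                   # client-b still owns the key
fresh = release_lock(r, key, "client-b")   # the real owner can release
print(stale, owner_after, fresh)
```

Without the ownership check, the late call from client-a would have deleted the key that client-b relies on.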

Complexity arises when we have a list of shared resources. In that case we will have multiple keys, one per resource. One should follow an all-or-none policy, i.e. lock all the resources, process them and release the locks, OR lock none and return. No partial locking should happen.
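One possible reading of the all-or-none policy is "try each key in turn and undo everything on the first failure". The sketch below follows that reading and is not the only way to do it: `InMemoryRedis` is again a hypothetical stand-in (ttl handling is left out to keep it short), and `acquire_all` is an illustrative name. Note that the keys are not locked in one atomic step; the rollback is what keeps partial locks from surviving.

```python
class InMemoryRedis:
    """Hypothetical stand-in for Redis; the ttl (`ex`) is accepted but ignored."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, nx=False, ex=None):
        if nx and key in self._data:
            return None  # key already held by someone
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def acquire_all(client, keys, owner, ttl_seconds=10):
    """Lock every key, or roll back and lock none of them."""
    taken = []
    for key in sorted(keys):  # fixed order, so clients contend predictably
        if client.set(key, owner, nx=True, ex=ttl_seconds):
            taken.append(key)
            continue
        # One key is already held: undo the locks taken so far.
        for held in taken:
            if client.get(held) == owner:
                client.delete(held)
        return False
    return True


store = InMemoryRedis()
resources = ["res-1", "res-2", "res-3"]

store.set("res-2", "client-b", nx=True)             # another client holds res-2
blocked = acquire_all(store, resources, "client-a")  # fails and rolls back
leftover = store.get("res-1")                        # nothing left behind
store.delete("res-2")                                # client-b releases
granted = acquire_all(store, resources, "client-a")  # now all three are taken
print(blocked, leftover, granted)
```

Sorting the keys is a design choice of this sketch: when every client asks for its keys in the same order, two clients are less likely to each grab half of the set and block one another.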

The above method guarantees:

  • Deadlock-free locking: because we use a ttl, the lock is released automatically after some time
  • No infinite wait

But still this has a couple of flaws which are very rare and can be handled by the developer:

  • If a client dies after locking, other clients need to wait for the duration of the TTL before they can acquire the lock. This will not cause any harm, though.
  • If a client takes too long to process and the key expires in the meantime, other clients can acquire the lock and process simultaneously, causing race conditions.

The above two issues can be mitigated by choosing a suitable TTL value, which depends on the type of processing done on that resource. So that was all about locking using Redis.

https://redis.io/

https://redislabs.com/ebook/part-2-core-concepts/chapter-6-application-components-in-redis/6-2-distributed-locking/
