The problem

So let's create a hypothetical.

A very common scenario for any service is the requirement to access data that is stored in some sort of database hundreds of times per minute. Our service of choice for this blog will be a game service like Steam, Uplay or Origin. These services are all game libraries where people can find hundreds of games in digital form, which they can purchase and download in order to play.

These services are really webapps that serve data back to the desktop client.

Now, even though I don't have inside information about Steam's infrastructure, based on the load they are able to handle, I can tell you that most of this data is not coming straight from a database but through some sort of distributed in-memory cache like Redis. This is because it wouldn't be efficient or fast for them to hit the database over and over again to serve data that changes infrequently.

How infrequently?
Tens and even hundreds of games are added to Steam every day, and on top of that the existing games go through price changes or get discounted for short periods of time.

The hypothetical we just talked about is a very common scenario for any type of application that stores and serves data. What I want to show in this blog is how we can load and synchronise our persistent storage database with an Azure Redis Cache.

This question can have multiple answers, but the solution I chose to go with is Azure Functions.

CosmosDB Trigger

Azure Functions support multiple types of trigger-based operations. One of them is the CosmosDB trigger, which is (well...) triggered whenever a document is added or changed.

For our hypothetical system, this means that whenever a game is published or changed in some way, we will get the new or updated version of the Game document in our CosmosDB Azure Function trigger method.

The out-of-the-box code is pretty easy to understand.
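
If you create a new CosmosDB-triggered function in Visual Studio, the generated code is very close to the following sketch; the database, collection and connection setting names are placeholders for our hypothetical system.

```csharp
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class GamesTrigger
{
    [FunctionName("GamesTrigger")]
    public static void Run(
        [CosmosDBTrigger(
            databaseName: "GamesDb",
            collectionName: "Games",
            ConnectionStringSetting = "CosmosConnectionString",
            LeaseCollectionName = "leases")] IReadOnlyList<Document> input,
        ILogger log)
    {
        // The trigger hands us every document added or changed since the last invocation.
        if (input != null && input.Count > 0)
        {
            log.LogInformation($"Documents modified: {input.Count}");
            log.LogInformation($"First document Id: {input[0].Id}");
        }
    }
}
```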

The Document object is an object coming from CosmosDB which represents the new or updated document in its entirety. What's really cool is that if a document is updated multiple times between two trigger calls, you will only get its latest version in the trigger, which means you always deal with the most up-to-date version of a document.

The LeaseCollectionName property in the CosmosDBTrigger attribute is the name of the CosmosDB collection that the function requires in order to keep track of how many changes have occurred and where it should start reading from in case of a failure. You will need to create this collection yourself. It doesn't need to be partitioned and it can be provisioned with the minimum 400 RU/s. You don't need to do anything manual with this collection other than creating it; the Azure Function will do the rest for you. The same lease collection can host multiple Azure Functions. All you need to do is add a LeaseCollectionPrefix in your CosmosDBTrigger. That way, each Azure Function will be able to identify which lease collection documents it owns.
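
For example, a second function sharing the same lease collection might declare its trigger like this (the prefix value is just an illustration):

```csharp
// Hypothetical second function sharing the "leases" collection; the prefix
// keeps its lease documents separate from the first function's.
[CosmosDBTrigger(
    databaseName: "GamesDb",
    collectionName: "Games",
    ConnectionStringSetting = "CosmosConnectionString",
    LeaseCollectionName = "leases",
    LeaseCollectionPrefix = "PricesToRedis")] IReadOnlyList<Document> input
```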

The CosmosConnectionString value, set via the ConnectionStringSetting property of the CosmosDBTrigger attribute, is the name of the setting which contains the CosmosDB connection string in our local.settings.json file (or in the app settings once deployed).
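
For local development that file would contain something along these lines (the connection string is a placeholder):

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet",
    "CosmosConnectionString": "AccountEndpoint=https://<your-account>.documents.azure.com:443/;AccountKey=<your-key>;"
  }
}
```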

Now that we have the trigger set up, let's move on to the next thing: saving the data in the Azure Redis cache.

Adding in Azure Redis cache

Adding in Azure Redis cache is really easy, and what I want to emphasise is just how simple the whole process is.

You are getting all the added or changed documents in a time window, and you don't need to check anything. You can simply call the ToString() method at the document level, which returns the string representation of the JSON document, and do a SetString in order to upsert the value. There is no need to check whether the key already exists in the Redis cache or not.
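
Here's a sketch of what the full function might look like using the StackExchange.Redis client, where the upsert call is named StringSet; the key scheme (game:{id}) and the hardcoded connection string are my assumptions for this hypothetical system.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using StackExchange.Redis;

public static class GamesToRedis
{
    // Hardcoded for simplicity; see the note below about settings and KeyVault.
    private static readonly Lazy<ConnectionMultiplexer> Redis =
        new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(
            "<your-cache>.redis.cache.windows.net:6380,password=<key>,ssl=True"));

    [FunctionName("GamesToRedis")]
    public static void Run(
        [CosmosDBTrigger(
            databaseName: "GamesDb",
            collectionName: "Games",
            ConnectionStringSetting = "CosmosConnectionString",
            LeaseCollectionName = "leases")] IReadOnlyList<Document> input,
        ILogger log)
    {
        if (input == null || input.Count == 0) return;

        IDatabase cache = Redis.Value.GetDatabase();
        foreach (Document document in input)
        {
            // document.ToString() returns the JSON representation of the document,
            // and StringSet acts as an upsert: it creates or overwrites the key.
            cache.StringSet($"game:{document.Id}", document.ToString());
            log.LogInformation($"Cached document {document.Id}");
        }
    }
}
```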

Something worth saying is that obviously, in a real-life scenario, you wouldn't hardcode the Redis connection string in there; it would live in an environment variable, in the settings, or even better in KeyVault. But for the sake of simplicity I will leave it there for now.
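
Moving it out of the code would be a one-line change, e.g. reading it from the app settings (the setting name here is an assumption):

```csharp
// Reads the Redis connection string from the environment / app settings
// instead of hardcoding it in the function.
ConnectionMultiplexer.Connect(Environment.GetEnvironmentVariable("RedisConnectionString"));
```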

Now if we deploy the Azure Function and add the following document in CosmosDB...
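
(An illustrative Game document; every field here is made up.)

```json
{
  "id": "game-001",
  "title": "Wingman",
  "price": 59.99,
  "discount": 0,
  "releaseDate": "2019-06-01"
}
```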

...then in a matter of seconds we can check the Redis cache and...

...it's there!

Now it's up to the consuming webapp to call the Azure Redis cache, retrieve the object and de-serialise it however it wants.
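
A minimal sketch of that consumer side, assuming the same game:{id} key scheme, Json.NET for deserialisation, and a hypothetical Game class:

```csharp
using Newtonsoft.Json;
using StackExchange.Redis;

// Hypothetical model matching the cached Game document.
public class Game
{
    public string Id { get; set; }
    public string Title { get; set; }
    public decimal Price { get; set; }
}

public class GameCacheReader
{
    private readonly IDatabase _cache;

    public GameCacheReader(ConnectionMultiplexer redis) => _cache = redis.GetDatabase();

    public Game GetGame(string id)
    {
        // StringGet returns a RedisValue which is empty on a cache miss.
        RedisValue json = _cache.StringGet($"game:{id}");
        return json.IsNullOrEmpty ? null : JsonConvert.DeserializeObject<Game>(json);
    }
}
```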

Bonus round - Loading the cache from scratch

Wait wait! What if, for some reason, your cache gets corrupted? How can you load everything back into the cache? So many questions!

Well the answer is actually pretty simple.

You can simply delete the documents for this Azure Function from the leases collection and add the StartFromBeginning = true property to the CosmosDBTrigger attribute of the function. That way, the lease documents will be recreated with a null continuation token, which will force the process to start reading the change feed from the beginning.
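
The trigger attribute would then look something like this:

```csharp
// Same trigger as before, now with StartFromBeginning = true so the change
// feed is replayed from the start once the lease documents are recreated.
[CosmosDBTrigger(
    databaseName: "GamesDb",
    collectionName: "Games",
    ConnectionStringSetting = "CosmosConnectionString",
    LeaseCollectionName = "leases",
    StartFromBeginning = true)] IReadOnlyList<Document> input
```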

(I know there are other ways, but I want to cover them in a Change Feed processor blog.)