Going Serverless

Ajay Pherwani, Vice President & CTO, Tata Capital

The concept of a public cloud offering infrastructure as a service is common. What really excites me about public clouds (at least the larger ones) is their ability to offer a multitude of services along with the infrastructure. Choosing the right public cloud largely depends on which services one needs, and how these services compare across the different market players.

Within this gamut of services, the one that has generated real excitement, and which we have implemented at Tata Capital, is “Serverless”. We are finding practical use cases for serverless almost weekly, if not daily. Serverless on its own is good; however, its true value emerges when it is coupled with other services such as workflow, API gateways, notification services, storage and many more.

While it is easier to implement serverless as a startup, leveraging it within a well-oiled enterprise ecosystem is not as simple. The challenges lie in rework, recoding, and ensuring that the risk level is not compromised. One common fallacy is to build the case around savings in ongoing cost: no need for dedicated servers, even on an opex model; no need to procure for peak loads; and so on. These, to me, are not the drivers for going serverless. The drivers are the elegance and simplicity of the solution, the agility with which these solutions roll out, the flexibility in managing versions of APIs, and the ability to have enterprise-class scalability on tap. Oh yes, and by the way, the cost of doing this is also significantly lower than what we would otherwise have ended up paying! Cost savings are a byproduct of a good serverless implementation, rather than the reason to go in for one.

Let me take an example of what we do at Tata Capital for the Consumer Durables business. We partner with multiple retail chains that sell products from multiple manufacturers. The deals for these chains and manufacturers can vary across cities, stores and dates. All this information is compiled by the business teams in Excel sheets, where the same model may appear across sheets with small differences such as an extra space, so some de-duplication is needed even in the entered data. Our earlier process was to use Excel macros (VBA) with an Access database on a local laptop, run the validations, and eventually use web services to synchronize the data with the backend RDBMS. The complexity of the validations, of keeping track of synchronizations, and so on meant that the entire process took approximately 5 hours end-to-end. During this 5-hour period, our sales teams at the stores did not have complete information, since it was refreshed only incrementally.
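To make the de-duplication problem concrete, here is a minimal sketch in Python of the kind of clean-up involved. It is not our production code: the column names (`chain`, `manufacturer`, `model`), the sample values and the choice to ignore letter case are all assumptions made for illustration.

```python
import re


def normalize_model(name: str) -> str:
    """Collapse runs of whitespace, trim the ends, and upper-case the name.

    Upper-casing is an assumption; the real rule may be case-sensitive.
    """
    return re.sub(r"\s+", " ", name).strip().upper()


def dedupe_rows(rows):
    """Keep the first row seen for each (chain, manufacturer, model) key."""
    seen = set()
    unique = []
    for row in rows:
        model = normalize_model(row["model"])
        key = (
            row["chain"].strip().upper(),
            row["manufacturer"].strip().upper(),
            model,
        )
        if key not in seen:
            seen.add(key)
            unique.append({**row, "model": model})
    return unique


# Hypothetical rows, as they might arrive from two different sheets.
rows = [
    {"chain": "ChainA", "manufacturer": "MakerX", "model": "TV 42  HD"},
    {"chain": "ChainA", "manufacturer": "MakerX", "model": " tv 42 HD "},
    {"chain": "ChainB", "manufacturer": "MakerX", "model": "TV 42 HD"},
]

for row in dedupe_rows(rows):
    print(row)
```

The first two rows differ only in spacing and case, so they collapse into one; the third survives because it belongs to a different chain.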

We have rewired this entire experience to work under a serverless umbrella on a public cloud, along with its associated services: API gateway, notifications, step functions, cloud storage and a database service. The user uploads the schemes and models to cloud storage, which fires up multiple serverless function instances in parallel. We run as many instances as there are combinations of chains and manufacturers, and the whole job finishes in 3 to 5 minutes; should these combinations increase, we would still be within the same 3 to 5-minute window. When all the parallel jobs complete, notifications are sent to the appropriate teams through the notifications service, by both email and SMS. The entire solution was designed and completed within 2 weeks.
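The fan-out idea, one parallel job per combination of chain and manufacturer, can be sketched locally in Python. This is only an illustration under stated assumptions: a thread pool stands in for the parallel serverless invocations, and `split_by_combination`, `process_group` and `run_all` are hypothetical names, not part of any cloud SDK or of our actual workflow.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_combination(rows):
    """Group uploaded rows by their (chain, manufacturer) combination."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["chain"], row["manufacturer"])].append(row)
    return dict(groups)


def process_group(item):
    """Stand-in for one parallel job: validate and load a single group."""
    (chain, manufacturer), group_rows = item
    # A real job would validate the rows and write them to the database.
    return {
        "chain": chain,
        "manufacturer": manufacturer,
        "rows_loaded": len(group_rows),
    }


def run_all(rows):
    """Run one worker per combination, then collect every result."""
    groups = split_by_combination(rows)
    if not groups:
        return []
    # One worker per combination, mirroring "as many as there are combinations".
    with ThreadPoolExecutor(max_workers=len(groups)) as pool:
        return list(pool.map(process_group, groups.items()))


# Hypothetical upload containing two combinations.
rows = [
    {"chain": "ChainA", "manufacturer": "MakerX", "model": "TV 42 HD"},
    {"chain": "ChainA", "manufacturer": "MakerX", "model": "TV 50 HD"},
    {"chain": "ChainB", "manufacturer": "MakerY", "model": "Fridge 300L"},
]

results = run_all(rows)
print(results)
```

Because every combination gets its own worker, adding combinations adds workers rather than elapsed time, which is why the overall window stays roughly constant. Sending the notification once `run_all` returns corresponds to the "all parallel jobs complete" step.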

So what have we established? Split the job across multiple parallel function instances that run for only a few minutes. This is pure serverless, triggered by a storage “PUT” event. Once this data is in the cloud database (itself run as a database service), the executives at the stores need to access it. For this we use the API gateway service, which is completely RESTful and has the basic elements of security, metering and throttling. So we have a simple solution, developed in an agile manner, using services that are guaranteed to have high availability and throughput, executing in a controlled workflow environment, without compromising on risk. All this results in a time saving in excess of 95 percent for the hundreds of executives we have at the stores across the country. This is the true business case.
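For readers who have not seen a storage-triggered function, the entry point might look roughly like the Python sketch below. The article does not name the cloud provider, so the event layout here is an assumption: it follows the shape of an Amazon S3 event notification, and the bucket name, object key and `handler` function are hypothetical.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Collect the (bucket, key) of every object created by a PUT."""
    uploads = []
    for record in event.get("Records", []):
        # Ignore anything that is not a PUT-created object.
        if not record.get("eventName", "").startswith("ObjectCreated:Put"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3-style events.
        key = unquote_plus(record["s3"]["object"]["key"])
        uploads.append((bucket, key))
    # A real handler would now start the workflow for each upload.
    return {"uploads": uploads}


# Hypothetical event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-schemes-bucket"},
                "object": {"key": "uploads/chain+a+schemes.xlsx"},
            },
        }
    ]
}

print(handler(sample_event))
```

The point of the sketch is that there is no server to keep running: the upload itself is the trigger, and the function exists only for the few minutes it takes to do its work.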

Still, let us not forget the numbers. We do this job 10 times a month, and each time we do a test run on a UAT environment and a final run on a Production environment. That is a total of 20 runs per month, with each run costing no more than 3 to 5 Rupees. My overall cost per month is therefore not more than 100 Rupees (less than 2 US$). As I wrote at the start, cost savings are a byproduct rather than the reason for undertaking the transformation to a serverless solution.
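The arithmetic behind these figures is simple enough to check in a few lines of Python. The cost inputs come straight from the text; the time-saving line assumes that the comparison is between the old 5-hour process and the upper end of the 3 to 5-minute window, which is one reading of the numbers rather than a measured result.

```python
# Cost: figures taken from the text above.
jobs_per_month = 10
runs_per_job = 2              # one UAT run plus one Production run
max_cost_per_run_inr = 5      # upper end of the 3 to 5 Rupee range

runs_per_month = jobs_per_month * runs_per_job
max_monthly_cost_inr = runs_per_month * max_cost_per_run_inr

# Time: assumes the old 5-hour process is compared with a 5-minute run.
old_minutes = 5 * 60
new_minutes = 5
time_saving_percent = (old_minutes - new_minutes) / old_minutes * 100

print(f"Runs per month: {runs_per_month}")
print(f"Maximum monthly cost: {max_monthly_cost_inr} Rupees")
print(f"Time saving: {time_saving_percent:.1f} percent")
```

Even at the slow end of the window, the saving comfortably clears the 95 percent mark quoted earlier.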