What is Serverless Architecture?
Let’s break this word into two parts – “Server” – “Less”.
Does this give you a notion that we are going to use less Servers? Maybe… Maybe not..
⚠ SPOILER ALERT – Yes! There are Servers in Serverless Architecture
But! To get to that point, we need to understand & tackle the first part – “Server”
What is a Server?
If we get to the very core of it, a Server is basically a Computer.
And like Computers, this “Server” has the following:
- One or more CPU(s)
- Some Memory (RAM)
- Some Storage
- Uninterrupted Power Supply
- An OS
- Required software or runtimes to run your Application
That’s it. That’s what a Server is. A Computer, essentially. And this Server can be on-premises or in the Cloud via some Cloud Vendor such as AWS, Azure, GCP, etc.
This Server can run your Application which will “serve” some incoming requests.
Most of you are already aware of this – Servers have been in use for decades now.
Then what is Serverless? Why the need for Serverless?
Let’s take an example.
A Simple Metrics App
Say we want to build a Web Application – where users can visit and see a table – containing metrics about the different hardware resources in our infrastructure and their capacity, status etc.
For the sake of this example, we chose to develop our application in Node.JS.
For every visit to our Web App, the backend must query a dataset stored in a database, apply some transformation logic, and respond with the final dataset to the frontend. Let’s say this whole operation takes ~2 to 5 seconds per request.
Being an erudite developer, you put in all your effort to develop this beautiful yet simple application, and now it’s time to deploy it. And you chose AWS as your Cloud Vendor.
Traditional Server Architecture & Deployment
For deployment you need to plan & choose different parameters for your Server – just as discussed before. Below are a few –
- Amount and type of CPU(s)
- Amount of Memory (RAM)
- OS etc.
Since it’s your first app, you chose the minimum-spec machine available.
Once your Server (or Computer) with above parameters is up and running, you remotely connect to it. Install all the necessary software packages & runtimes (in our case Node.JS) required to run your application. Then you upload your application’s code / build artifact and finally deploy!
Now, your application is up and running. And people can visit your Web App and see the metrics.
The above diagram represents this simple architecture.
Your application is running fine, and it can – say – handle a load of 10 concurrent requests per second.
One day, there was a pandemic, and everyone was put under lockdown. This increased the traffic on your Application, as more people were working remotely and constantly checking the infra status. The load went from 10 Requests per second to 15 Requests per second. But your Server, based on the parameters you chose, could only serve 10 Requests per second. What do you do?
There are many things you can do. But one of the primary actions you can take is to apply some Scaling!
Scaling & Load Balancing
You configured an “Auto Scaling” strategy, which tells AWS to create more replicas of your initial Server if the traffic increases, so that the increased number of requests can be served.
Since there are now multiple Servers, you added a “Load Balancer”, which takes care of distributing the incoming requests evenly among the Servers.
And similarly, when the traffic decreases, Auto Scaling will decommission the extra Servers that were spun up, leaving you with the initial capacity.
Above diagram shows the Scaled Architecture – which can handle 15 Requests / Second
Note – The Unit of Scale here is a Server. There can be 1, 2, or more Servers. In the above example – with 2 Servers – your actual capacity is 20 Requests / Second (10 for each Server). But even if the load increases from 10 to just 13 Requests / Second, a second Server will be spun up – whose full capacity we are still not using.
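The note above boils down to a tiny bit of arithmetic. Here it is as a toy calculation – not an AWS API, just an illustration of why server-based scaling jumps in coarse steps:

```javascript
// Toy illustration of server-based scaling: the unit of scale is a whole
// Server, so provisioned capacity always jumps in steps of 10 req/s.
const CAPACITY_PER_SERVER = 10; // requests/second one Server can handle

function serversNeeded(load) {
  // At least one Server stays up even with no traffic.
  return Math.max(1, Math.ceil(load / CAPACITY_PER_SERVER));
}

function provisionedCapacity(load) {
  return serversNeeded(load) * CAPACITY_PER_SERVER;
}

console.log(serversNeeded(13));       // 2 – a second Server is spun up
console.log(provisionedCapacity(13)); // 20 – for a 13 req/s load
```

A 13 req/s load pays for 20 req/s of capacity; 7 req/s of headroom sits idle.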
After a few months, the pandemic was over. And in subsequent months, as people resumed their normal lives, the traffic on your application reduced significantly. Hardly 1 request per minute!
You had already chosen the minimum-spec machine you could find on AWS and can’t go any lower. And all that computational power of your Server is wasted most of the time, as it’s sitting idle. Costing you money!
In retrospect, there were two areas which created a lot of burden on us –
- Operational Tasks – Maintaining software, runtimes, security patches, network configuration, etc.
- Capacity Planning – Configuring Scaling -up & down, Load Balancer etc.
What if we could somehow lessen this burden?
This is where Serverless Architecture steps in.
In Serverless Architecture, we want to reduce these burdens of Operational Tasks & Capacity Planning. And focus on Development.
Thus, compared to Traditional Server Architecture, here we do not provision a Server or its associated configuration. Rather, in Serverless Architecture, we usually do the following –
- You choose your preferred Runtime – Node.JS, Python, Go, Java etc.
- Write your code for that preferred runtime – package and upload it.
- Configure the amount of Memory (RAM) your code might need.
Liberation! – Generally, you don’t need to worry much about Operational Tasks & Capacity Planning. Your Cloud Vendor does that for you!
Different Cloud Vendors provide & support “Serverless Services” – as FaaS (Function as a Service) – in their portfolio.
In AWS, one such service is called “AWS Lambda”.
In AWS Lambda – our business logic / code goes into a Lambda “Function” which is an entity that can be deployed and “Invoked”.
But you might be wondering –
How will this Lambda “Function” tackle scaling? Increase or Decrease in traffic? Load Balancing?
Answer – Your Cloud Vendor manages that for you.
Manages what actually?
Lifecycle of Containers!
What is a Container?
Glad you asked.
A container – when deployed – is an isolated logical environment of your Computer which has some allotted CPU(s), RAM & Storage of your Computer.
A container behaves as if it were an individual computer with its own CPU, RAM, Storage, etc.
Thus, a single Computer (or Server) can have many Containers up and running at the same time – each having its own allotted resources (which you can configure) and running independently. (This is possible because of Linux kernel features such as cgroups, namespaces, seccomp, iptables, chroot, etc.)
One of the many benefits: you can create multiple apps – with different runtimes and software requirements – package them as Containers, then deploy and run them on the same machine.
Each Container has all the necessary runtimes, application code, and dependencies packaged within it.
Consider the diagram above; notice how different Apps have different versions of Java or Node.JS and still run without interference on a Single Machine.
Similarly, in AWS Lambda, when you “Invoke” a Lambda function, AWS behind the scenes –
- Creates a Container in some machine in their infrastructure – and allocates a share of resources to this Container (CPU, RAM, etc.)
- Downloads & deploys your application code in that Container
- Processes the Input / Request
- Stops the Container when done.
Looking at these steps you might get a hunch that this whole process is going to be slow. No, it’s not. It’s fast! (Especially with a thing called Container re-use.)
The actual meaning of this buzzword “Serverless” – Managing “less” Operational Tasks for a Server.
It still has a Server
A Simple Metrics App – Serverless Architecture Design
If we were to design our Metrics Web App in a Serverless way – the Architecture would look like this –
Compared to the Traditional design – there are a couple of things to notice –
- Invoke – In Serverless, your Container (AWS Lambda Function) is not up and running beforehand. This is in stark contrast compared to traditional Server Architecture where a Server is provisioned and deployed first.
Each Lambda Function is “Invoked” based on a Trigger Event. Only then is it deployed, and it serves your request.
In our case – API Gateway will trigger the Invocation of our Lambda Function.
Once the Lambda Function execution is finished, AWS will manage its reuse or removal lifecycle.
Benefit – You only pay for the time your Lambda function was active. If your Lambda function is not triggered – you will not be billed.
- API Gateway – Together with AWS Lambda, API Gateway forms the user-facing part of your AWS Serverless infrastructure. Why? Since a Lambda Function does not exist beforehand, it does not have a pre-allotted IP Address. An IP Address is assigned to a Lambda Function during its invocation.
Since our Metrics Application is a “Web Application”, we need a URL Endpoint.
API Gateway does that for us. It provides us with a URL Endpoint which a User can access.
When a user accesses this URL end-point – API Gateway invokes a Lambda function and returns the response provided by the Lambda function to the user.
- Concurrency – Helps in Scaling. As discussed before, this is one of the benefits of Serverless. By default, you don’t have to configure Scaling for your Lambda Function. AWS does it for you.
As the name suggests, it dictates how many concurrent Lambda functions can run at a given time. (By default, the burst concurrency in the US East Region is 3000; for other Regions it can be 500. It’s configurable as well.)
What happens if the incoming requests increase?
As mentioned before, Load Balancing and Scaling (Concurrency) by default – is managed by AWS. (You can change it as per your liking.)
If there are 3 Concurrent requests, API Gateway will invoke 3 Lambda functions (Containers). Each of these will process their request, return a response and then stop.
If we get 100 concurrent requests, API Gateway will invoke 100 concurrent Lambda functions and so on. (Until you reach your concurrency limit.)
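A toy simulation of this behaviour – each “invocation” below is an independent async task, and nothing stops 100 of them from running at once (until the concurrency limit):

```javascript
// Toy simulation of Lambda concurrency: each invocation is independent,
// so 100 requests yield 100 simultaneous "functions".
let inFlight = 0;
let peakConcurrency = 0;

async function invoke(request) {
  inFlight += 1;
  peakConcurrency = Math.max(peakConcurrency, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 10)); // pretend work
  inFlight -= 1;
  return `handled ${request}`;
}

async function main() {
  // Fire 100 "invocations" at once, as API Gateway would under load.
  const requests = Array.from({ length: 100 }, (_, i) => invoke(i));
  await Promise.all(requests);
  console.log(peakConcurrency); // 100 – all ran simultaneously
}
main();
```

Contrast this with the traditional design, where 100 concurrent requests would need you to have pre-provisioned 10 Servers.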
Thus, in Serverless, the AWS Lambda Function is the Unit of Scale. (Whereas in traditional Server Architecture, the Server was the Unit of Scale.)
Serverless Cost and Billing
In AWS Lambda, you can only allot the amount of Memory (RAM) for your Lambda function – not CPU(s). This is intentional: based on the amount of memory you allot, AWS proportionally allots CPU(s).
Thus, in AWS you will be billed in GB-Seconds – the amount of Memory allotted multiplied by the execution time.
Let’s say, our Metrics App now has low traffic – 1 Request / Minute. And each request takes ~5 seconds for Lambda Execution. And our Lambda Function requires 128MB of Memory.
Then, Number of Executions per month = 43,200 (60 Requests per hour × 24 hours × 30 days).
Which, at the time this article was written, might cost you only ~$1!
Compared to the Traditional Server Architecture with a similar number of requests per month, this is much cheaper.
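Here is the back-of-the-envelope math behind that figure. The rates used are the published on-demand Lambda prices at the time of writing ($0.0000166667 per GB-second, $0.20 per million requests) and may change; the free tier is ignored to keep the math simple:

```javascript
// Back-of-the-envelope GB-second cost for the Metrics App example.
// Rates are historical on-demand Lambda prices; free tier ignored.
const PRICE_PER_GB_SECOND = 0.0000166667; // USD
const PRICE_PER_REQUEST = 0.0000002;      // USD ($0.20 per million)

const invocationsPerMonth = 60 * 24 * 30; // 1 request/minute -> 43,200
const secondsPerInvocation = 5;
const memoryGb = 128 / 1024;              // 128 MB expressed in GB

const gbSeconds = invocationsPerMonth * secondsPerInvocation * memoryGb;
const monthlyCost =
  gbSeconds * PRICE_PER_GB_SECOND +
  invocationsPerMonth * PRICE_PER_REQUEST;

console.log(gbSeconds);              // 27000 GB-seconds
console.log(monthlyCost.toFixed(2)); // well under a dollar
```

Even doubling the memory or execution time keeps this far below the cost of an always-on Server.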
Great, should I start moving my applications to Serverless Architecture?
Yes and No.
Like all architectures, it has its place. There are certain use-cases where it’s very well suited – e.g. File Processing, Data & Analytics, Mobile Applications, etc. – and saves you a lot of cost. And there are others where it’s not.
I’ll share an overview of one of the use-cases we solved here at Impetus – with AWS Serverless implementations.
ETL via Serverless Architecture
Use-Case – In one of the projects I was working on, we would receive Data Files for different customers rather sporadically. For one Customer we would get files once every day, for another we might get them once a week, and for the rest – every third day or so. The dataset was moderate in size. Once we received the data, we wanted to transform it as per the business use-case. The transformation process generally took a minute or so.
Thus, we wanted a Service –
- Would be triggered only when we get the data.
- Would process the data and shut down.
- Would bill us only for the execution time whenever it is triggered.
As you may have guessed –
We used Serverless Architecture – as it perfectly suited this use-case and saved us a lot of cost.
Below is the Solution Diagram
Whenever the Customer uploaded the Data File to S3, it triggered our AWS Lambda function. (Note – instead of API Gateway, this time AWS S3 is triggering the invocation of our Lambda function.)
There is so much more you can do in Serverless Architecture.
e.g. You can even create a Pipeline by combining multiple AWS Lambdas to create a flow. One such service is called AWS Step Functions.
In fact, in our use-case above – we ourselves used AWS Step Functions (rather than a single Lambda function.)
Below is a more accurate representation of our solution, where AWS S3 triggers a Step Function Pipeline – inside it, each Lambda passes its output – as input – to the next Lambda in the Pipeline.
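Such a pipeline is described in Step Functions’ Amazon States Language. A hedged sketch of what our state machine might have looked like – the function ARNs and state names here are hypothetical placeholders, not the actual project’s definitions:

```javascript
// Sketch of an AWS Step Functions state machine (Amazon States Language)
// for the file-transformation pipeline. ARNs and state names are
// hypothetical; each Task state's output becomes the next state's input.
const stateMachine = {
  Comment: "File-transformation pipeline triggered by an S3 upload",
  StartAt: "ValidateFile",
  States: {
    ValidateFile: {
      Type: "Task",
      Resource: "arn:aws:lambda:us-east-1:123456789012:function:validate",
      Next: "TransformData",
    },
    TransformData: {
      Type: "Task",
      Resource: "arn:aws:lambda:us-east-1:123456789012:function:transform",
      Next: "StoreResult",
    },
    StoreResult: {
      Type: "Task",
      Resource: "arn:aws:lambda:us-east-1:123456789012:function:store",
      End: true,
    },
  },
};

console.log(JSON.stringify(stateMachine, null, 2));
```

Step Functions handles retries, branching, and passing each Lambda’s output to the next one, so the Lambdas themselves stay small and single-purpose.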
Serverless Architecture is opening new frontiers in technology. It’s a cost-effective solution not just for Cloud Customers but for Cloud Vendors as well, as it allows them to better utilize their infrastructure by allocating Containers on unused / idle resources.
Hope you enjoyed this article, as much as I did – writing it. Have a nice day 😊
References
- Lambda Container Reuse – https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/
- Lambda Scaling – https://docs.aws.amazon.com/lambda/latest/dg/invocation-scaling.html
- Lambda Concurrency – https://aws.amazon.com/blogs/compute/managing-aws-lambda-function-concurrency/
- AWS Lambda Price Calculator – https://s3.amazonaws.com/lambda-tools/pricing-calculator.html
- Common Lambda Use-Cases – https://docs.aws.amazon.com/lambda/latest/dg/applications-usecases.html
- AWS Step Functions – https://aws.amazon.com/step-functions/