Cost optimized Serverless CI/CD strategy using GitLab
Prashant Singh
September 4, 2020

Motivation

Lowering IT infrastructure and operations costs and reducing the complexity of infrastructure provisioning are among the main reasons serverless architecture has become a popular choice for many projects these days. Here, we discuss the role of serverless architecture in a project’s CI/CD and how to choose a cost-optimized managed solution for it.

What is serverless architecture?

The word serverless is somewhat misleading. Anyone unfamiliar with it might think of serverless as an architecture where no server is involved. That’s not true: the servers are still there. It’s just that, as developers, we don’t have to worry about the complexity of managing and operating those servers ourselves.

One of the major advantages of using serverless is a significant reduction in cost. But we should be mindful of the type of problems for which serverless solutions are best suited. Not every workload can be serverless. For example, it won’t be prudent to train your machine learning model using Lambda functions, which are designed for short-lived executions rather than long-running jobs.

Typically, in serverless architecture, we split the application into multiple microservices. With microservices, it is easier to achieve high availability and scalability, but the trade-off is having many loosely coupled services without ACID-like transaction consistency. For managing the integration and deployment of these microservices, an end-to-end automated CI/CD pipeline is a prerequisite.

What is CI/CD?

Continuous Integration – It is a software development practice where developers frequently integrate code into a shared repository, and with every integration we run automated operations such as the build, unit tests, code style checks, and other checks depending on the project.

Continuous Delivery – It can be thought of as an automated process of deploying the changes merged into a branch to the corresponding server.

The combination of these two concepts helps developers focus more on code and on implementing new features, and deliver value faster and more transparently.

When choosing the right platform for CI/CD, we have many options. But generally, the options fall into one of two categories –

  1. Managed CI/CD services
  2. Self-hosted services which we need to set up and manage ourselves

These two types of services differ on various parameters, but we will evaluate them mainly on infrastructure management, cost, and extensibility.

Parameter | Managed CI/CD service | Self-hosted CI/CD service
Infrastructure | Completely hosted and supervised by an external organization offering CI/CD services. It is the service provider’s responsibility to run and scale the services and to maintain server health. | We are responsible for making infrastructural decisions and for keeping the underlying servers healthy by servicing hardware, patching software, and ensuring the services are available, secure, and performing adequately.
Cost | Competitive pricing, but the price might not scale well and could increase considerably as the team size increases. | Running and managing the infrastructure and servers and keeping them healthy requires a dedicated team, and running self-hosted resources adds to the cost.
Extensibility | Services might not support all platforms, tools, and environments. Before introducing a new technology or language in the project, we must check whether it is supported by the service provider. | Highly extensible, with more support for different platforms and languages. Some services can even be customized with plugins/extensions to support functionality that is not available by default.
Server availability | Shared VM runners as well as dedicated VM runners | Dedicated runners
Provider | GitLab CI, AWS CodePipeline, Azure DevOps, GitHub Actions, Atlassian Bamboo CI, Travis CI | Jenkins, TeamCity

Traditional Approach to CI/CD

Historically, Jenkins has been the automation server of choice for CI/CD in many organizations. While Jenkins is a stable, proven, and extensible solution, it has its own set of limitations.

First of all, you have to maintain your own build servers, which keep running 24x7. Much of that cost is wasted, because the application code is obviously not being built throughout the day. Then, you need to have a dedicated team of DevOps engineers to manage your automation pipelines and also look after the stability and performance of the Jenkins servers. While this may work for large and complex projects, it may be overkill for most of the other projects out there.

Jenkins has a large number of redundant and infrequently maintained plug-ins. If you decide to use one of them, chances are that its support will lapse at some point in the future, in which case you might end up reconfiguring your pipeline. Also, Jenkins pipelines are natively defined in a Groovy-based Jenkinsfile rather than in the YAML configuration that most modern CI services use for pipeline specifications.

From the above points, it is clear that most organizations may not be inclined to commit so many resources to managing their build automation process, as the traditional approach requires.

Using serverless CI/CD

This is where serverless CI/CD makes a lot more sense. This approach aims to address the shortcomings of the traditional approach.

For the purpose of this article, our area of focus is to optimize the cost of our CI/CD operations using managed serverless solutions. Of these, we shall focus on GitLab CI as it is a promising option, considering its competitive pricing and the breadth and depth of its CI offerings. It also supports serverless applications and makes it easy to set up a pipeline.

GitLab offers a fully functional version control system and an integrated CI solution without the need for any other application. We can build a complete CI/CD pipeline solution with just one service. GitLab CI/CD uses GitLab Runner to execute the pipeline. In the managed offering, these runners are basically shared virtual machines in the cloud, in which the code that we push to the version control system is built as per the rules specified in the configuration file.

GitLab CI relies on having a “.gitlab-ci.yml” file in the root of your repo. Whenever a commit is pushed, a CI/CD pipeline executes against the version of “.gitlab-ci.yml” in that commit. GitLab Runner just needs “.gitlab-ci.yml” to execute the pipeline.

Moreover, your project deployment might require additional values to be injected externally for the sake of security. For such cases, there is support for Variables which essentially are key-value pairs that can be referenced during deployment.

A “.gitlab-ci.yml” file declares the stages of the pipeline and the jobs that run in each stage. We can add more stages to it depending upon project requirements.
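As an illustration, a minimal “.gitlab-ci.yml” could be sketched as follows. This is not the file from the project described in this article: the Node.js image, the job names, the npm commands, and the dist/ output folder are all assumptions chosen only to show the general shape of stages and jobs.

```yaml
# Minimal .gitlab-ci.yml sketch.
# Image, job names, commands and paths are illustrative assumptions.
stages:
  - build
  - test
  - deploy

default:
  image: node:18            # assumed build image

build-job:
  stage: build
  script:
    - npm ci
    - npm run build
  artifacts:
    paths:
      - dist/               # assumed build output, passed on to later stages

test-job:
  stage: test
  script:
    - npm ci
    - npm test

deploy-job:
  stage: deploy
  script:
    - echo "Deploying build output..."   # placeholder for the real deploy command
  rules:
    # run the deployment only for commits on the default branch
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```

Jobs in the same stage run in parallel, and a stage starts only after the previous one has succeeded, so adding a stage is a matter of appending its name to the stages list and assigning jobs to it.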

The Use Case

We will go ahead and demonstrate a use case where we used GitLab CI to automate the deployment process for one of our clients, a leading eCommerce company in the print-and-upload sector. Their application was based on a microservices architecture in which each microservice had its own deployment specifications, managed separately in the version control system. The application was deployed on AWS.

The code for each microservice resides in a separate repository. Hence, a “.gitlab-ci.yml” was defined for each of them. Moreover, the application supported multiple environments (dev, stage, uat and prod). Thus, the specification for each of these environments was defined in the configuration file.

GitLab CI/CD needs to interact with the AWS account to which the deployment is to be done; for that, it requires credentials in the form of AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. These are sensitive details that must not be stored anywhere in the code, hence we use variables to inject such values into the pipeline.
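One possible way to describe several environments in a single configuration file is sketched below. The branch-to-environment mapping, the TARGET_ENV variable, and the deploy.sh script are hypothetical and are not taken from the client project; only the environment names (dev, stage, uat, prod) come from the text above.

```yaml
# Sketch: one deploy job per environment, sharing a hidden template job.
# Branch names, TARGET_ENV and deploy.sh are hypothetical.
stages:
  - deploy

.deploy-template:
  stage: deploy
  script:
    - ./deploy.sh "$TARGET_ENV"   # hypothetical deployment script in the repo

deploy-dev:
  extends: .deploy-template
  variables:
    TARGET_ENV: dev
  environment:
    name: dev
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"

deploy-stage:
  extends: .deploy-template
  variables:
    TARGET_ENV: stage
  environment:
    name: stage
  rules:
    - if: $CI_COMMIT_BRANCH == "stage"

deploy-uat:
  extends: .deploy-template
  variables:
    TARGET_ENV: uat
  environment:
    name: uat
  rules:
    - if: $CI_COMMIT_BRANCH == "uat"

deploy-prod:
  extends: .deploy-template
  variables:
    TARGET_ENV: prod
  environment:
    name: prod
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual                # production waits for a manual trigger
```

Keeping the shared steps in a template job means that each microservice repository only has to state what differs per environment.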
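A deploy job that relies on such variables might look like the sketch below. It assumes AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION have been defined under the project’s Settings > CI/CD > Variables (ideally marked as masked and protected), so they never appear in the repository. The S3 sync step, the DEPLOY_BUCKET variable, and the choice of GitLab’s AWS base image are assumptions for illustration; the actual deployment commands of the project are not described in this article.

```yaml
# Sketch: deploy job using credentials injected through CI/CD variables.
# DEPLOY_BUCKET and the S3 sync step are hypothetical.
deploy-to-aws:
  stage: deploy
  image: registry.gitlab.com/gitlab-org/cloud-deploy/aws-base:latest
  script:
    # The AWS CLI reads AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
    # AWS_DEFAULT_REGION from the job environment; nothing is hard-coded here.
    - aws sts get-caller-identity          # confirms the credentials are picked up
    - aws s3 sync dist/ "s3://$DEPLOY_BUCKET"
```

Because project-level variables are exposed to the job as environment variables, the same file can be committed safely while each repository or environment supplies its own credentials.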

So, the overall setup resembles something like this:

In this article, we are evaluating GitLab CI and hence are focusing on the configuration aspects of the same. However, competing services like GitHub Actions, Bamboo CI, Travis CI, Azure DevOps, etc. have a similar approach towards CI/CD. They require a standard CI configuration file to be defined. This file contains rules for the pipeline including supporting multi-step deployments. 

But where’s the cost-saving?

Looking at the above solution, you might be asking yourself, okay I get it, I can use GitLab CI for setting up my continuous integration and continuous deployment pipeline. But where exactly is the cost-saving?

GitLab CI (like many of the other providers listed above) offers a monthly free quota of build minutes. Most of the providers offer unlimited free minutes for public projects while providing a generous limit for private projects. The duration for which the application build runs is counted against your quota. When you exhaust your provided limit, you can easily top it up with additional minutes. This removes the need to provision dedicated build servers.

Your application is built on shared VM runners, which are typically Linux machines running in the cloud. However, some of the providers also offer Mac machines. So basically, you are getting a build environment of your choice without investing in expensive hardware for specific requirements.
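Since build duration is what counts against the quota, it can help to keep jobs short. The sketch below shows a few standard “.gitlab-ci.yml” keywords that are commonly used for this; the Node.js image, paths, and the 15-minute limit are assumptions and should be adapted to the project.

```yaml
# Sketch: keeping build-minute consumption low.
# Image, paths and the timeout value are illustrative assumptions.
build:
  stage: build
  image: node:18
  interruptible: true       # lets a newer pipeline cancel this job (needs auto-cancel enabled in project settings)
  timeout: 15 minutes       # stops a hung job from burning through the quota
  cache:
    key:
      files:
        - package-lock.json # reuse the cache until dependencies change
    paths:
      - .npm/
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run build
  rules:
    # skip the job entirely when no relevant files changed
    - changes:
        - src/**/*
        - package-lock.json
```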

And to top it all, you don’t need to have a dedicated DevOps team to manage your CI/CD automation. Your developers can themselves manage the end delivery of the product by configuring it accordingly.

We hope you now see the cost savings.

Conclusion

CI/CD is an integral part of software delivery for modern applications. Managed CI/CD solutions offer a serverless approach to managing your pipelines, thereby reducing costs and operational overheads. They may not cater to every use case, but they do fit a majority of them, especially for cloud-based modern frontend and backend applications. Even better, their capabilities keep improving with time. So, the next time you think of a CI/CD strategy for your organization, you can certainly give these solutions serious thought.