How Serverless Tech Slashed Our AWS Bills by 90%
The web development world is one of, if not the, most rapidly changing areas of tech. It is nearly impossible to keep up with every new technology, so being able to filter real trends from fake ones is critical, or you will be refactoring your application every week. One of the largest recent trends is serverless technology, and especially edge compute with edge runtimes, with companies like Vercel playing quarterback. Through my work at IPS on our trading software QuantStop, we saw the benefits serverless could provide, made the switch to a serverless architecture, and reduced our AWS expenses by more than 90%.
Understanding serverless
Before I dive into the weeds of how I did this, let's first discuss what serverless technology is. Serverless lets you run code on demand without a dedicated or shared server. You pay only for the compute you use, and the dynamic autoscaling nature of serverless lets you scale nearly infinitely, so you no longer need a team of engineers just managing container orchestration. This sounds like a great solution, and for a lot of cases it is. One area where serverless struggles, however, is cold starts, which typically happen when your instance has not been invoked for a period and needs to fully boot up again to process the next incoming request. This is where the new edge runtimes shine (future article coming about this), but they are still very new and many packages do not support them yet. Another area that can cause issues is database connections: a database can only handle so many concurrent connections. Unless you have some serious traffic coming to your website, 95% of applications do not need to worry about this, but it does create the complication of needing to pool your database connections if you're planning truly enterprise-level concurrency.
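One common way to soften the connection problem is to create the connection at module scope so warm invocations of the same instance reuse it instead of reconnecting. Here is a minimal sketch of that pattern; the `get_connection` helper is hypothetical, standing in for a real driver call such as `psycopg2.connect(...)`:

```python
# Hypothetical sketch: reuse a database connection across warm Lambda invocations.
# Anything created at module scope survives between invocations of the same
# container, so the expensive connect happens only on a cold start.

_connection = None  # module-level cache, populated on the first (cold) invocation

def get_connection():
    """Return the cached connection, creating it only on a cold start."""
    global _connection
    if _connection is None:
        # Placeholder for a real driver call, e.g. psycopg2.connect(DB_URL)
        _connection = object()
    return _connection

def handler(event, context):
    conn = get_connection()  # warm invocations skip the connect cost
    # ... run queries using `conn` ...
    return {"statusCode": 200, "body": "ok"}
```

For true enterprise-level concurrency this only reduces churn per container; a proxy such as RDS Proxy or an external pooler is still needed to cap total connections.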
What makes an application a good fit for serverless?
An application is a great fit for serverless if its traffic comes in spikes or certain routes take up a lot of compute power (check out Fly.io's FLAME; they are doing some cool things over there to rethink when we use serverless). If your website has a constant, predictable hum of traffic where each route is light on compute, you will probably spend more on serverless, and the occasional cold start will degrade user experience (in reality you can never truly prevent cold starts). Serverless is also useful if you are a hobbyist who does not want to pay for, or manage, a dedicated server: you can host your website for pennies a month without worrying about server maintenance. This is how companies like Vercel are able to provide such a great free tier for hobbyists. QuantStop was a perfect fit because it is trading software: 95%+ of its usage and compute happens during market hours, which is only ~32.5 hours a week, yet we were paying for compute around the clock. We initially adjusted for this with a burst-optimized EC2 instance type. That adjustment was economical, but it did not reduce expenses nearly as much as serverless did, and it did not solve the problem of scaling infinitely without building complex container orchestration.
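The back-of-the-envelope arithmetic behind that fit is simple: US market hours are roughly 6.5 hours a day, 5 days a week, so an always-on server is busy less than a fifth of the time. A quick sketch of the utilization math (the figures are the rough ones from above, not an actual bill):

```python
# Rough utilization math for a workload that is only active during market hours.
market_hours_per_week = 6.5 * 5   # ~32.5 busy hours per week
total_hours_per_week = 24 * 7     # 168 hours an always-on server bills for
utilization = market_hours_per_week / total_hours_per_week

print(f"busy hours/week: {market_hours_per_week}")  # 32.5
print(f"utilization: {utilization:.1%}")            # 19.3%
```

With pay-per-invocation pricing, the idle ~80% of the week simply stops appearing on the bill.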
The solution
Our website required a tailored solution beyond the standard automated deployment methods. Our backend, developed in Python, was essential for handling large, data-intensive tasks and complex mathematical operations. We also implemented a quasi-distributed system, running separate programs that helped traders find optimal trading opportunities and signals on entry and exit points. This unique setup led us to a custom AWS architecture utilizing:
- Lambda
- CloudFront
- S3
- ECR
- API Gateway
The DNS was configured to point at CloudFront, which either served static files or connected to API Gateway, triggering our Docker-containerized Lambda functions.
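With API Gateway's Lambda proxy integration, each function receives the HTTP request as the `event` and must return an object with `statusCode`, `headers`, and a string `body`. A minimal handler in that shape (the path and payload are illustrative, not QuantStop's actual API):

```python
import json

def handler(event, context):
    """Minimal handler for API Gateway's Lambda proxy integration.

    API Gateway passes the HTTP request in `event` and expects a dict
    with statusCode / headers / body, where body must be a string.
    """
    path = event.get("path", "/")
    payload = {"message": f"hello from {path}"}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```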

For the scheduled batch processes in our data engineering pipelines, we employed containerized Lambda functions triggered on a schedule through CloudWatch Events.
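A CloudWatch Events (now EventBridge) scheduled invocation delivers an event that includes an ISO-8601 `time` field, and the handler simply kicks off the batch job. A hedged sketch of such a handler; `run_pipeline` is a hypothetical stand-in for the real batch logic:

```python
def run_pipeline(as_of):
    """Hypothetical placeholder for a real batch job
    (e.g. pulling and transforming market data)."""
    return {"processed_at": as_of, "rows": 0}

def handler(event, context):
    # CloudWatch scheduled events carry an ISO-8601 timestamp in `time`.
    as_of = event.get("time", "unknown")
    result = run_pipeline(as_of)
    print(f"batch run at {as_of}: {result['rows']} rows")
    return result
```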
Managing infrastructure becomes critical when your application extends beyond a basic frontend and server setup, particularly when aiming for staging environments that consistently mirror production. Terraform, an infrastructure-as-code (IaC) tool, was our solution to this challenge. With over 50 AWS resources in play, not using Terraform or a similar tool is a recipe for errors and makes modifications extremely difficult.
Conclusion
After all was said and done, we established a CI/CD workflow that efficiently staged our code and seamlessly promoted it to production with zero downtime, all while leveraging serverless technologies. This approach resulted in a dramatic 90% reduction in AWS costs and significantly enhanced website performance during peak usage periods.