Brent Mitchell

After 5 years, I'm out of the serverless compute cult

I have been using serverless compute and storage for nearly five years, and I'm finally tired of it. I do feel like it has become a cult. In a cult, brainwashing happens so gradually that people have no idea it is going on. I feel this has happened across the board with so many developers; many don't even realize they are clouded. In my case, I swallowed the serverless marketing and hype hook, line, and sinker for the first half of my serverless journey. After working with several companies, small and large, I have been continually disappointed as our projects grew. The fact is, serverless technology is amazingly simple to start with, but becomes a bear as projects and teams accelerate. A serverless project typically includes a fully serverless stack, which can include (using a non-exhaustive list of AWS services):

  1. API Gateway
  2. Cognito
  3. Lambda
  4. DynamoDB
  5. DAX
  6. SQS/SNS/EventBridge

Combining all of these into a serverless project becomes a huge nightmare, for the following reasons.

Proprietary Services and Local Testing

All of these solutions are proprietary to AWS. Sure, a Lambda function is a pretty simple idea; it is simply a function that executes your code. But the other services listed above have almost no easy, testable substitutes once they are integrated together. The Serverless Application Model and Localstack have done some amazing work attempting to emulate these services. However, they usually cover only basic use cases, and an engineer ends up spending a chunk of time trying to mock services or figure out a way to test their function locally. Or they simply forget it and deploy. Also, since these functions typically depend on other developers' functions or API Gateway, there tend to be ten different ways to authorize a function. For example, someone might have an unauthorized API, one may use AWS credentials, another might use Cognito, and yet another uses an API key. All of these factors leave an engineer with little to no confidence in their ability to test anything locally.
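One way to claw back some local testability is to keep business logic out of the AWS wiring entirely, so it can be exercised without SAM, Localstack, or a deploy. Here is a minimal sketch in Python; the handler and fake table are illustrative, not from any real project:

```python
# Sketch: keep business logic separate from AWS clients so it can be
# unit-tested without Lambda, API Gateway, or a live DynamoDB table.
# get_customer and FakeTable are hypothetical names for illustration.

def get_customer(table, customer_id):
    """The DynamoDB dependency is injected, not imported at module scope."""
    item = table.get_item(Key={"id": customer_id}).get("Item")
    if item is None:
        raise KeyError(f"customer {customer_id} not found")
    return item

class FakeTable:
    """Stands in for a boto3 Table resource in local tests."""
    def __init__(self, items):
        self._items = items

    def get_item(self, Key):
        item = self._items.get(Key["id"])
        return {"Item": item} if item else {}

# Local test run: no AWS account, no network, no Localstack.
table = FakeTable({"42": {"id": "42", "name": "Ada"}})
print(get_customer(table, "42")["name"])  # prints: Ada
```

In production the real boto3 table resource is passed in instead; the logic under test never changes.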

Account Chaos

Since engineers typically don't have high confidence in their code locally, they depend on testing their functions by deploying them. This means possibly breaking their own code. As you can imagine, this breaks things for everyone else deploying and testing any code which relies on the now-broken function. While there are a few solutions to this scenario, all are usually quite complex (i.e. using an AWS account per developer) and still cannot be tested locally with much confidence. Chaos engineering has a time and a place. This is not it.

Security

With all the possible permutations of deployments and account structures, security becomes a big problem. Good IAM practice is hard. Many engineers simply grant dynamodb:* on every resource in the account to a Lambda function (by the way, this is not good). These policies become hard to manage because developers can usually deploy and manage their own IAM roles and policies quite easily. And since it is hard to test locally, fixing serverless IAM issues requires deploying to AWS and testing (or breaking) things in the environment.
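For contrast, a least-privilege alternative to dynamodb:* scopes the policy to the specific actions and table a function actually needs. The account ID and table name below are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Customers"
    }
  ]
}
```

The function can read this one table and nothing else; a compromised or buggy function can't touch the rest of the account.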

Bad (Cult-like) Practices

No Fundamental Enforcement

Without help from frameworks, DRY (Don't Repeat Yourself), KISS (Keep It Simple, Stupid), and other essential programming principles are simply ignored. In a perfect world, a team would reject PRs that do not abide by these basics. However, with the huge push to the cloud over the past several years, many junior developers have had the freedom to do whatever they want in the serverless space because of its ease of use, resulting in developers en masse adopting something that doesn't improve the health of the developer ecosystem as a whole. AWS hands you a knife by providing such an easy way to deploy code on the internet. Please don't hurt yourself with it.

Copy and Paste Culture

Most teams end up copying code into each new microservice, proliferating it across many services. I have seen teams with hundreds and even thousands of functions, with nearly every function being different. This culture has gotten out of hand, and now teams are stuck with these functions. Another symptom is not taking the time to set up proper DNS.

DNS Migration Failures

Developers take the generic API Gateway-generated DNS name (of the form https://{api-id}.execute-api.{region}.amazonaws.com) and litter their code with it. There will come a time when the team wants to put a real DNS name in front of the API, and now you're faced with locating the 200 different spots where the generated name was used. And it's not as easy as a Find/Replace: the search becomes a problem when the DNS name lives in a mix of hard-coded strings, parameterized or concatenated strings, and environment variables. Oh, and telemetry? Yeah, that's nowhere to be found.
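The boring fix is to resolve the base URL from configuration in exactly one place, so a later DNS migration is a one-line change instead of a hunt through 200 call sites. A sketch, where API_BASE_URL and the default URL are illustrative:

```python
import os

# Sketch: one chokepoint for the API base URL. Call sites never see the
# raw execute-api hostname; swapping in a custom domain later means
# changing a single environment variable. The default below is a
# placeholder in the generated API Gateway format, not a real endpoint.

def api_url(path: str) -> str:
    base = os.environ.get(
        "API_BASE_URL",
        "https://abc123.execute-api.us-east-1.amazonaws.com/prod",
    )
    return f"{base.rstrip('/')}/{path.lstrip('/')}"

print(api_url("/customers/42"))
```

Every HTTP call in the codebase goes through `api_url(...)`; the migration to a real domain touches only the configuration.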

Microservice Hell

This isn't a post about microservices. However, as teams and developers can add whatever they want to their YAML deployment files, you end up with hundreds of dependent services and hundreds of repositories, many with different approaches and/or different CI/CD workflows. I've also found that repository structures begin to diverge widely. Any perceived cost savings have now been consumed by managing all of these deployments and repositories. Here are a few examples of how developers choose to break up their serverless functions by Git repository:

  1. One monolith repository for all their APIs.
  2. One repository per API Gateway or queue processor.
  3. One repository per "domain" (e.g. /customers or /invoices).
  4. One repository per endpoint (I have seen developers maintain a repository for POST:/customers and a separate one for GET:/customers/:id, and so on…).

Many times, developers switch between these styles and structures daily. This becomes a nightmare not only for day-to-day development, but also for any developer trying to quickly understand how the code deploys and which dependencies it has or impacts.

API Responses

The serverless cult has been active long enough now that many engineers who entered the field on serverless never learned the basics of HTTP responses, and some are now veterans still lacking that knowledge. While this is not strictly a serverless problem, I have never experienced this much abuse outside of serverless. I've seen one set of endpoints returning 200, 400, and 500 like normal, while another set returns only 2xx responses, with payloads like:

  "status": "ERROR",
  "reason": "malformed input"
Enter fullscreen mode Exit fullscreen mode
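The fix is mundane: map errors onto real HTTP status codes in the proxy response instead of tunneling them through a 200. A sketch of an API Gateway Lambda proxy-style handler, where the validation rule is purely illustrative:

```python
import json

# Sketch: encode errors in the HTTP status code instead of returning a
# 200 with {"status": "ERROR"} in the body. The response shape follows
# the API Gateway Lambda proxy integration (statusCode/headers/body).

def response(status_code, body):
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }

def handler(event, context=None):
    try:
        payload = json.loads(event.get("body") or "")
    except json.JSONDecodeError:
        return response(400, {"error": "malformed input"})  # not a 200!
    return response(200, {"received": payload})

print(handler({"body": "{not json"})["statusCode"])  # prints: 400
```

Clients (and monitoring) can then rely on the status code alone to distinguish success from failure.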

Then, another set of endpoints implements inconsistent response patterns depending on some permutation of query parameters. For example:
Query 1:

  "accountId": "1234",
  "firstName": "John",
  "lastName": "Newman"
Enter fullscreen mode Exit fullscreen mode

Query 2:

  {
    "account": {
      "id": "1234",
      "name": "John Newman"
    }
  }

Inventing New Problems

As mentioned previously, initially deploying these types of services is easy. The reality is that these kinds of serverless structures come with new problems that don't typically occur in server-backed services:

  1. Cold starts - many engineers don't care much about this, but they suddenly start caring when Function A calls Function B, which calls Function C, and so on. Short of some voodoo warm-up scripting or paying for provisioned concurrency, you may be out of luck.
  2. In the past five years, the teams I have been a part of have always chased the latest features, because we had previously been hand-rolling workarounds for things like FIFO queues, state machines, and provisioned concurrency. As teams chase each feature released by AWS (or your cloud provider of choice), things become even harder to test and maintain, since SAM and Localstack don't support those features for some time.
  3. Some awful custom eventing solution because… serverless. Engineers think simply putting an API Gateway in front of EventBridge will solve all their eventing problems. What about retries? What about duplicate events? What about replaying events? Schema enforcement? Where does the data land? How do I get the data? These are all questions that have to be answered or documented in a custom fashion. Ok, EventBridge supports a few of these things in some form but it does leave engineers chasing the latest features, waiting for these to become available. However, outside of the serverless cult, these issues can be solved with Kafka, NATS, or other technologies. Use the right tool.
  4. When it's not okay to discuss the advantages and disadvantages of serverless with other engineers without fear of reprisal, it might be a cult. Many of these engineers say Lambda is the only way to deploy anymore. There isn't much thought given to offline solutions when things need to run on-site or disconnected from the cloud. For some companies this can be a fine approach. However, many medium to large organizations have (potentially) offline computing needs outside the cloud. Lambda cannot deliver real-time updates to a sensitive, remote pressure device during an internet outage in the middle of Canada in winter.
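To make the "what about duplicate events?" question from point 3 concrete, here is one way a consumer can deduplicate by event id. This is only a sketch with an in-memory seen-set; a real system would use a durable store (e.g. a conditional write to a database) so deduplication survives restarts:

```python
# Sketch: idempotent event consumption. Hand-rolled eventing layers have
# to answer the duplicate-delivery question themselves; brokers like
# Kafka or NATS give you building blocks for it. All names illustrative.

class IdempotentConsumer:
    def __init__(self, handle):
        self._handle = handle
        self._seen = set()

    def process(self, event):
        event_id = event["id"]
        if event_id in self._seen:
            return False          # duplicate delivery: skip side effects
        self._handle(event)
        self._seen.add(event_id)  # mark only after a successful handle
        return True

processed = []
consumer = IdempotentConsumer(lambda e: processed.append(e["payload"]))
consumer.process({"id": "evt-1", "payload": "created"})
consumer.process({"id": "evt-1", "payload": "created"})  # replayed delivery
print(processed)  # prints: ['created']
```

Marking the event as seen only after the handler succeeds gives at-least-once semantics with suppressed duplicates; the handler itself must still tolerate a crash between those two steps.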

So, how do I get out of the cult?

I didn't plan for this article to address the many options you have to extricate yourself from the grip of mindless serverless abuse. If you're interested, please leave a comment and I will write a follow-up on the different solutions and alternatives to serverless I've found, as well as some tips to incrementally shift back to a normal life. What I wanted to do here was express the pains I have experienced with serverless technologies over the past couple of years, now that I have also helped architect more traditional VM- and container-based tech stacks. I felt compelled to make sure individuals, teams, and organizations know what serverless can really mean for long-term sustainability in an environment.

What does serverless do well?

Deployment and scaling. That's really it for most organizations. For a lot of these organizations, it's hard to find the time, people, and money to figure out how to automatically provision new VMs, get access to a Kubernetes cluster, etc. My challenge to you is to first fix your deployment and scaling problems internally before reaching for serverless compute.

Conclusion

Serverless is one of the hottest cloud trends. However, I have found it leads to more harm than good in the long run. While I understand that some of the problems listed above are not unique to serverless, they are much more prevalent there, leading engineers to spend most of their time on YAML configuration or troubleshooting function execution rather than crafting business logic. What I find odd is the lack of complaints from the community. If I'm alone in my assessment, I'd love to hear from you in the comments below. I've spent a significant amount of time over the last few years working to undo my own serverless mistakes as well as those made by other developers. Maybe I'm the one who has been brainwashed? Time will tell.

Discussion (16)

Elias Brange

Most of the points here sound like the result of bad communication and organization, and all of them could surface without serverless as well. Teams that build things without agreeing on certain standards are bound to build services that don't interact nicely with each other, regardless of whether they run in a Lambda, a container, or a VM.

Depending directly on other teams' functions?

Stop doing that, and expose each individual service as an API. Then it does not matter whether there are lambdas, containers or even VMs behind the API. Care must be taken to not break the APIs, and that needs to be done regardless of what kind of compute you are using.

10 different ways to authorize

This also boils down to communication between teams. If all services are exposed as APIs, it would be preferable to use a common auth mechanism for them. Here you could even centralize that function in one team and let them be in charge of an authorizer that other teams can use in their API Gateways. For services that do not run behind API Gateway, expose the authorizer functionality as an API that can be called from inside containers, in API middleware, or similar.
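The shared-authorizer idea can be sketched as a single Lambda authorizer returning an IAM policy document; the token check below is a stand-in for real JWT validation, and all names are illustrative:

```python
# Sketch: one team owns this Lambda authorizer; every team's API Gateway
# references it instead of inventing its own auth scheme. A TOKEN-type
# authorizer receives authorizationToken/methodArn and must return an
# IAM policy allowing or denying execute-api:Invoke.

def policy(principal, effect, resource):
    return {
        "principalId": principal,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": effect,
                "Resource": resource,
            }],
        },
    }

def authorizer_handler(event, context=None):
    token = event.get("authorizationToken", "")
    # Placeholder check; a real authorizer would verify a JWT signature.
    if token == "allow-me":
        return policy("user", "Allow", event["methodArn"])
    return policy("user", "Deny", event["methodArn"])
```

Because the policy shape is the contract, the consuming teams never need to know how validation works internally; the auth team can swap the implementation without breaking anyone.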

Account chaos

More of an organizational issue. I wouldn't want to log in to a console where 20 teams have 4 EC2 instances each, either. Use AWS Organizations and automate the creation of accounts, preferably one account per service (or some similar scope).

DNS migration

Same goes for ALB, NLB and other services.

leading engineers to spend most of their time with YAML configuration

Kubernetes says hello.

Brent Mitchell (Author)

Hey there, Elias, appreciate the feedback! There is no doubt that team communication is critical, serverless or otherwise. My larger point was that focusing on business objectives (those that move the business forward) rather than technical ones should be the priority of every productive team.

Additionally, I probably could have been clearer that it doesn't matter whether a developer is hitting an API or a Lambda directly. The testing difficulty of serverless means many developers and teams first test in their dev and test environments. This means my functions/APIs may suddenly start failing in dev because another team is making a change to their API, which is now broken for whatever reason. Again, this is because it is very hard, if not impossible, to test anything locally.

As I mentioned, appreciate the feedback!

Elias Brange

Hello there! Great answer! :)

I totally agree that testing distributed systems, where different teams are responsible for different services, is very hard. And I also agree that all the serverless offerings still need a bit of work before they can be mimicked perfectly locally. However, I feel that this problem would still be there even if the compute layer were running on something else.

How we solved it at my previous place was that every team had a development environment, where they experimented and things were not expected to always work. Other services were mocked where needed. We then had staging and production deployments of every service, where all services were expected to be stable, with integration suites testing and verifying that the actual business use cases (often spanning multiple services) were working as expected.

Since all contracts between teams were defined by APIs, it didn't really matter what was running underneath. Some teams used API Gateway + Lambdas extensively, others used ALB + Fargate, and some teams used NLB + EC2.