How do I scale up my web application?

This is a question that software engineers, especially backend developers, are asked very often. It is a broad topic, and the same answer will not work for every app, but I will try to explain it as best I can in this article.

Before we move ahead, we need to understand what scaling up really means. To do that, let’s first look at how your server's resources are utilized. Servers are nothing but computers connected to public networks. Each machine is built from hardware components like processors, RAM, storage, etc., and each component has its limits. Take the processor as an example: each processor has n cores, each core can run p processes, and each process can use t threads. At peak, when all these resources are in use, there is little to no processing power left to execute more instructions, and that’s when our computers start acting slow.

When our application makes a request, it goes to these servers, and the servers execute the set of instructions that we have written in our code (how that communication happens is a different topic). To execute those instructions, our application uses whatever resources it requires. As the number of active users increases, the number of requests to our server increases, and hence so does the number of instructions to be executed. Beyond a certain number of requests, more and more requests start failing. To serve as many requests as possible (that is, to increase performance), we need to make sure our server has enough resources available, and that is what scaling up our application is all about.
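A rough back-of-the-envelope calculation shows why failures appear past a certain load. The sketch below assumes a hypothetical server with a fixed number of workers, each handling one request at a time, and a fixed average handling time per request. The function name and all numbers are made up for illustration and do not describe any real system.

```python
def max_requests_per_second(workers: int, ms_per_request: int) -> float:
    """Upper bound on throughput when each worker handles one request at a time."""
    return workers * 1000 / ms_per_request


# Hypothetical numbers: 8 workers, 50 ms of work per request.
capacity = max_requests_per_second(workers=8, ms_per_request=50)

# Hypothetical incoming load, in requests per second.
incoming = 200

# Requests beyond capacity have to wait in a queue or fail.
excess = max(0.0, incoming - capacity)
print(f"capacity={capacity} req/s, excess={excess} req/s")
```

Under these assumptions, either more workers (more resources) or less work per request (better code) raises the capacity.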

A few ways to scale up your application

Optimize resource utilization

Before you start thinking about scaling up your application, you first need to understand how it uses resources. There are many ways to make sure your application uses the available resources efficiently. Let’s look at a few of them:

  1. Optimize your code. Make sure your application is not using resources unnecessarily. To do this, you need to understand time complexity, which gives you a better idea of how your code's running time grows with the size of its input. For example, a binary search on sorted data can run much faster than a linear search, and using a hashmap instead of looping through a list for every search can improve performance significantly. To come up with better solutions, you need a good understanding of data structures and algorithms.

  2. When using a database, make sure your queries are optimized so they take less time to run. To speed up database access, you can use approaches like indexing, sharding, replication, etc. For example, reports may take some time to fetch data from the database; in such cases, you could run them against a replicated database node. Also, reduce the number of queries made to the database.

  3. Use async. Node.js gained a lot of fame for the performance it gets from asynchronous programming. Nowadays a lot of web frameworks are adopting async/await for better performance. I would highly recommend you go through the following documentation to learn more about asynchronous code:

These are a few things you should consider while developing your application. There are more options, such as using a cache, or using Lambda functions or queues for background tasks that should run on separate worker servers. When it comes to designing a system, there are many techniques you can refer to; choose the ones that suit your use case.
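To make the hashmap point from item 1 concrete, here is a minimal sketch in Python. The `User` class, the sample data, and both lookup functions are hypothetical names invented for this example. The first function loops over the list on every search; the second builds a dictionary once and then looks users up by key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    user_id: int
    name: str


# Hypothetical sample data.
users = [User(1, "Ada"), User(2, "Linus"), User(3, "Grace")]


def find_user_linear(user_id: int) -> Optional[User]:
    """Scan the list on every call: O(n) per lookup."""
    for user in users:
        if user.user_id == user_id:
            return user
    return None


# Build the index once: O(n). Each later lookup is O(1) on average.
users_by_id = {user.user_id: user for user in users}


def find_user_indexed(user_id: int) -> Optional[User]:
    """Look the user up in the prebuilt dictionary."""
    return users_by_id.get(user_id)


print(find_user_linear(2))
print(find_user_indexed(2))
```

With three users the difference is invisible, but when the list is large and searched on every request, the dictionary avoids repeating the full scan each time. The trade-off is the extra memory for the index and the need to keep it in sync with the list.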

Increase resources

Even after following the best techniques during development, there may come a point when your application starts slowing down because of the hardware limitations described at the start of this article. At that point, you can scale up your application with hardware changes.

  1. Vertical Scaling. To overcome hardware limitations, we could simply upgrade our hardware to a higher configuration. For example, if 2 cores are not enough, we could move our application to 4-core machines. This technique is called vertical scaling.

  2. Horizontal Scaling. This method involves using multiple machines together to share the application load, with a load balancer in front that directs each incoming request to one of the servers.

  3. Hybrid approach. We could always use the above approaches in combination.
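As an illustration of the load balancer mentioned under horizontal scaling, the sketch below hands requests to servers in a rotating order (round robin). This is only one possible strategy, chosen here because it is the simplest; the class name and server addresses are hypothetical, and a real load balancer would also handle health checks and failures.

```python
from itertools import cycle
from typing import List


class RoundRobinBalancer:
    """Hands out backend servers in a fixed rotating order."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self) -> str:
        """Return the server that should receive the next request."""
        return next(self._servers)


# Hypothetical pool of three identical application servers.
balancer = RoundRobinBalancer(["app-1:8000", "app-2:8000", "app-3:8000"])

# Five incoming requests are spread across the pool in turn.
assigned = [balancer.next_server() for _ in range(5)]
print(assigned)
```

In these terms, horizontal scaling means adding more entries to the server list, vertical scaling means making each listed machine more powerful, and the hybrid approach does both.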


Software engineering is all about trial and error. Before you start asking yourself this question, focus on developing your application first. As you run into more problems, you will find more solutions. Follow the basics, develop your application, and once it is complete, start optimizing and thinking about scaling it.