Note: The terms 'machine' and 'node' are used interchangeably in this section. A machine is a unit of compute capacity (CPU and memory) and need not be a physical machine; it could also be a virtual machine.
Let us take a look at what application scaling is and the different ways in which we can scale an application.
What is Application Scaling?
It is the process of adding resources to (or removing resources from) an application so that it can handle a growing (or shrinking) workload.
There are two ways to scale an application:
- Vertical Scaling
- Horizontal Scaling
Vertical scaling or 'Scale Up/Down':
It is a type of scaling where we add (or remove) CPU or memory on an existing server or machine.
Scale Up example: Increase the number of CPUs from 4 to 8.
Scale Down example: Decrease memory from 1GB to 512MB.
- Easier to achieve than horizontal scaling.
- Requires limited changes to the application (since most modern applications support multi-threading, adding more CPUs should help).
- There is a limit to how far we can scale up, since it depends on the maximum CPU and memory supported by the hardware.
- Scaling might require application downtime.
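As a rough illustration of the points above, here is a minimal Python sketch of vertical scaling. The `Machine` type, the `scale_vertically` function and the hardware ceiling values (`MAX_CPUS`, `MAX_MEMORY_MB`) are all hypothetical names and numbers chosen for this example; real limits depend on the actual hardware or VM type.

```python
from dataclasses import dataclass

# Hypothetical hardware ceiling, assumed only for this illustration.
MAX_CPUS = 16
MAX_MEMORY_MB = 65536


@dataclass(frozen=True)
class Machine:
    cpus: int
    memory_mb: int


def scale_vertically(machine: Machine, cpus: int, memory_mb: int) -> Machine:
    """Return a resized copy of the same machine (scale up or down).

    Raises ValueError when the request exceeds what the hardware supports,
    which is the main limitation of vertical scaling.
    """
    if not 1 <= cpus <= MAX_CPUS:
        raise ValueError(f"cpus must be between 1 and {MAX_CPUS}")
    if not 1 <= memory_mb <= MAX_MEMORY_MB:
        raise ValueError(f"memory_mb must be between 1 and {MAX_MEMORY_MB}")
    return Machine(cpus=cpus, memory_mb=memory_mb)


server = Machine(cpus=4, memory_mb=1024)
scaled_up = scale_vertically(server, cpus=8, memory_mb=1024)    # 4 -> 8 CPUs
scaled_down = scale_vertically(server, cpus=4, memory_mb=512)   # 1GB -> 512MB
```

Note that the number of machines never changes here; only the size of the one machine does, and a request beyond the ceiling is rejected.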
Horizontal scaling or 'Scale Out/In':
It is a type of scaling where we add (or remove) machines and connect them together to form a distributed computing system. So instead of one huge, super-powered machine, we can use a large number of lower-configuration machines (also known as commodity hardware).
- No hard limit imposed by a single machine's hardware; capacity grows by adding more machines.
- Scaling usually doesn't require application downtime.
- Often less expensive, because lower-configuration hardware is used.
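To make the idea concrete, here is a minimal Python sketch of a horizontal scaling decision. It assumes, purely for illustration, that load is measured in requests per second and that every node handles the same fixed load; the function names `nodes_needed` and `scaling_delta` are hypothetical and do not come from any particular platform.

```python
import math


def nodes_needed(requests_per_second: float, capacity_per_node: float) -> int:
    """Number of identical nodes required to serve the given load."""
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    # Always keep at least one node running, even with no load.
    return max(1, math.ceil(requests_per_second / capacity_per_node))


def scaling_delta(current_nodes: int, requests_per_second: float,
                  capacity_per_node: float) -> int:
    """Positive result: scale out by that many nodes.
    Negative result: scale in by that many nodes.
    Zero: the current node count already matches the load."""
    return nodes_needed(requests_per_second, capacity_per_node) - current_nodes


grow = scaling_delta(current_nodes=2, requests_per_second=1000, capacity_per_node=300)
shrink = scaling_delta(current_nodes=5, requests_per_second=500, capacity_per_node=300)
```

Unlike vertical scaling, each node keeps the same size; only the count of nodes changes as the workload grows or shrinks.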
Distributed computing introduces its own set of challenges, such as fault tolerance, which are covered in a separate section.
Most cloud providers and platforms today (such as AWS, Cloud Foundry and OpenShift) provide the ability to scale out as well as scale in.