Scalable Architectures

Note: The terms 'machine' and 'node' are used interchangeably in this section. A machine is a unit of compute capacity (CPU and memory) and need not be a physical machine; it could also be a virtual machine.

Let us look at what application scaling is and the different ways in which we can scale an application.

What is Application Scaling?

It is a process of adding (or removing) resources to an application, so that it can handle a growing (or shrinking) workload.

There are two ways to scale an application:

Vertical Scaling
Horizontal Scaling

Vertical scaling or 'Scale Up/Down':

It is a type of scaling where we add CPU or memory to (or remove it from) an existing server or machine.

Scale Up example: Increase the number of CPUs from 4 to 8.

Scale Down example: Decrease memory from 1 GB to 512 MB.

Benefits:

Easier to achieve.
Requires limited changes to the application (since most modern applications support multi-threading, adding more CPUs should help).

Challenges:

There is a limit on how far we can scale up, since it depends on the maximum CPU/memory supported by the hardware.
Scaling might require application downtime.
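
The scale up and scale down examples above, together with the hardware limit, can be sketched in a few lines of Python. This is only an illustration: MachineSpec, scale_vertically and the MAX_CPUS / MAX_MEMORY_MB ceilings are hypothetical names and values chosen for this sketch, not part of any real provider's API.

```python
from dataclasses import dataclass

# Hypothetical hardware ceiling, chosen only for illustration.
MAX_CPUS = 64
MAX_MEMORY_MB = 262144


@dataclass(frozen=True)
class MachineSpec:
    cpus: int
    memory_mb: int


def scale_vertically(spec, cpus=None, memory_mb=None):
    """Return a resized spec for the same machine, or raise if the
    requested size is invalid or beyond the assumed hardware ceiling."""
    new_cpus = spec.cpus if cpus is None else cpus
    new_memory = spec.memory_mb if memory_mb is None else memory_mb
    if new_cpus < 1 or new_memory < 1:
        raise ValueError("a machine needs at least 1 CPU and 1 MB of memory")
    if new_cpus > MAX_CPUS or new_memory > MAX_MEMORY_MB:
        raise ValueError("requested size exceeds what the hardware supports")
    return MachineSpec(new_cpus, new_memory)


machine = MachineSpec(cpus=4, memory_mb=1024)
scaled_up = scale_vertically(machine, cpus=8)            # 4 CPUs -> 8 CPUs
scaled_down = scale_vertically(machine, memory_mb=512)   # 1 GB -> 512 MB
```

Note that the number of machines never changes here; only the size of the single machine does, which is why the ceiling eventually becomes the bottleneck.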

Horizontal scaling or 'Scale Out/In':

It is a type of scaling where we add (or remove) machines and connect them together to form a distributed computing system. Instead of one very powerful machine, we use a larger number of lower-configuration machines (also known as commodity hardware).

Benefits:

No hard limit on scaling capacity, since more machines can be added as needed.
Scaling typically doesn’t require application downtime.
Inexpensive, because it uses lower-configuration hardware.

Challenges:

Distributed computing introduces its own set of challenges, such as fault tolerance, which are covered in a separate section.
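
To make the idea of horizontal scaling concrete, here is a minimal sketch of a pool of identical nodes with round-robin request routing. NodePool and the node names are hypothetical, and the sketch deliberately ignores the distributed-computing challenges mentioned above (health checks, fault tolerance, shared state).

```python
class NodePool:
    """A toy pool of identical nodes with round-robin request routing."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._next = 0  # index of the node that receives the next request

    def scale_out(self, node):
        """Add a machine to the pool."""
        self.nodes.append(node)

    def scale_in(self, node):
        """Remove a machine from the pool, keeping at least one node."""
        if len(self.nodes) == 1:
            raise ValueError("cannot remove the last node")
        self.nodes.remove(node)
        self._next %= len(self.nodes)  # keep the index within bounds

    def route(self):
        """Return the node that should handle the next request."""
        node = self.nodes[self._next]
        self._next = (self._next + 1) % len(self.nodes)
        return node


pool = NodePool(["node-1", "node-2"])
first_round = [pool.route() for _ in range(4)]

pool.scale_out("node-3")   # capacity grows while the pool keeps serving
second_round = [pool.route() for _ in range(3)]

pool.scale_in("node-1")    # capacity shrinks again
third_round = [pool.route() for _ in range(2)]
```

Because requests are only ever routed to whatever nodes are currently in the pool, machines can be added or removed without stopping the application as a whole.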

Reference Implementations:

Most cloud providers and platforms today (such as AWS, Cloud Foundry and OpenShift) provide the ability to scale out as well as scale in.
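
How such a platform decides when to scale out or in varies by provider. As a rough illustration only, the sketch below uses a simple proportional rule: pick the node count that would bring average utilization back to a target, then clamp it to configured bounds. The function name, parameters and default bounds are assumptions made for this example and do not correspond to any specific provider's API.

```python
import math


def desired_node_count(current_nodes, avg_utilization_pct, target_utilization_pct,
                       min_nodes=1, max_nodes=10):
    """Return how many nodes are needed to bring average utilization
    back to the target, clamped to [min_nodes, max_nodes].

    Utilization values are whole-number percentages (e.g. 90 for 90%).
    """
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")
    if target_utilization_pct <= 0:
        raise ValueError("target_utilization_pct must be positive")
    # Total load is current_nodes * avg_utilization; divide by the target
    # per-node utilization and round up so the target is not exceeded.
    wanted = math.ceil(current_nodes * avg_utilization_pct / target_utilization_pct)
    return max(min_nodes, min(max_nodes, wanted))


scale_out_to = desired_node_count(4, 90, 60)   # overloaded: add nodes
scale_in_to = desired_node_count(4, 20, 60)    # underused: remove nodes
capped_at = desired_node_count(4, 100, 10)     # demand beyond max_nodes
```

With 4 nodes at 90% utilization and a 60% target, the rule asks for 6 nodes; at 20% utilization it asks for 2; and a request beyond the upper bound is capped at max_nodes.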

Scalable Architecture