Scalable Architectures

Note: The terms 'machine' and 'node' are used interchangeably in this section. A machine is a unit of compute capacity (CPU and memory) and need not be a physical machine. It could also be a virtual machine.

Let us look at what application scaling is and the different ways in which we can scale an application.

What is Application Scaling?

It is the process of adding resources to (or removing them from) an application so that it can handle a growing (or shrinking) workload.

There are two ways to scale an application:

  1. Vertical Scaling
  2. Horizontal Scaling

Vertical scaling or 'Scale Up/Down':

It is a type of scaling in which we add CPU or memory to (or remove it from) an existing server or machine.

Scale Up example: Increase the number of CPUs from 4 to 8.

Scale Down example: Decrease memory from 1 GB to 512 MB.

Advantages:

  1. Easier to achieve.
  2. Requires limited changes to the application (since most modern applications support multi-threading, adding more CPUs should help).

Disadvantages:

  1. There is a limit on how far we can scale up, since it depends on the maximum CPU/memory supported by the hardware.
  2. Scaling might require application downtime.
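
The scale up and scale down examples above, together with the hardware ceiling, can be sketched in a few lines of Python. This is only an illustration: the `Machine` class, its `resize` method, and the `max_cpus` / `max_memory_mb` limits are hypothetical names and values chosen for this sketch, not part of any real provisioning API.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """A single node. All limits below are illustrative assumptions."""
    cpus: int
    memory_mb: int
    max_cpus: int = 64            # hypothetical hardware ceiling
    max_memory_mb: int = 262144   # hypothetical hardware ceiling (256 GB)

    def resize(self, cpus: int, memory_mb: int) -> None:
        """Scale up or down; reject requests the hardware cannot support."""
        if not 1 <= cpus <= self.max_cpus:
            raise ValueError(f"cpus must be between 1 and {self.max_cpus}")
        if not 1 <= memory_mb <= self.max_memory_mb:
            raise ValueError(f"memory_mb must be between 1 and {self.max_memory_mb}")
        self.cpus = cpus
        self.memory_mb = memory_mb


server = Machine(cpus=4, memory_mb=1024)
server.resize(cpus=8, memory_mb=1024)  # scale up: 4 -> 8 CPUs
server.resize(cpus=8, memory_mb=512)   # scale down: 1 GB -> 512 MB
```

Asking for more than the assumed ceiling, for example `server.resize(cpus=128, memory_mb=512)`, raises a `ValueError`, which mirrors the first disadvantage listed above.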

Horizontal scaling or 'Scale Out/In':

It is a type of scaling in which we add (or remove) machines and connect them together to form a distributed computing system. Instead of one very powerful machine, we use a large number of machines with a lower configuration (also known as commodity hardware).

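Advantages:

  1. No hard limit on capacity from a single machine's hardware, since more machines can be added as the workload grows.
  2. Scaling usually doesn’t require application downtime, since the remaining machines keep serving while others are added or removed.
  3. Inexpensive, because of the lower configuration hardware.

Disadvantages:

  1. Distributed computing introduces its own set of challenges, such as fault tolerance, which are covered in a separate section.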
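
As a rough illustration of scaling out, the sketch below estimates how many identical machines a workload needs and then spreads requests across them in turn. The function names, the requests-per-second figures, and the round-robin policy are all assumptions made for this example; real systems typically rely on a load balancer and measured capacity rather than fixed numbers.

```python
import math
from itertools import cycle


def nodes_needed(requests_per_second: float, capacity_per_node: float) -> int:
    """Smallest number of identical nodes that can carry the workload."""
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    # Always keep at least one node running, even for a tiny workload.
    return max(1, math.ceil(requests_per_second / capacity_per_node))


def assign_round_robin(requests: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Hand requests to nodes one at a time, cycling through the node list."""
    assignment: dict[str, list[str]] = {node: [] for node in nodes}
    for request, node in zip(requests, cycle(nodes)):
        assignment[node].append(request)
    return assignment


# Hypothetical workload: 2500 requests/s, each node handles about 1000 requests/s.
count = nodes_needed(requests_per_second=2500, capacity_per_node=1000)
nodes = [f"node-{i}" for i in range(1, count + 1)]
plan = assign_round_robin([f"req-{i}" for i in range(7)], nodes)
```

If the workload later drops to 500 requests per second, `nodes_needed(500, 1000)` returns 1, so the extra machines can be removed again. That is the "scale in" direction of horizontal scaling.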

Reference Implementations:

Most cloud providers and platforms today (such as AWS, Cloud Foundry and OpenShift) provide the ability to scale out as well as scale in.

