A Load Balancer distributes incoming requests (traffic) across multiple resources (e.g., data nodes or compute nodes).
Why do we need a Load Balancer?
Let us say we have X nodes (or machines), which either store the same data (i.e., replicated data) or perform the same computation. We can keep adding similar nodes as needed. Now, when a request comes in (either a data operation or a computation), we need to route it to one of these nodes.
A Load Balancer performs this routing job. It prevents a scenario wherein one node is overloaded with excess work while the other nodes sit idle. A Load Balancer also assists in failover, as it will not route requests to failed nodes. It maintains a registry of all the nodes and regularly sends heartbeats to each node to check whether it is alive, updating the registry accordingly. It also updates its registry dynamically when a node is added or removed.
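The registry-plus-heartbeat idea can be sketched as follows. This is a minimal illustration, not any particular product's implementation; the class name `NodeRegistry` and the timeout value are assumptions made for the example.

```python
import time

class NodeRegistry:
    """Tracks nodes and treats them as dead when heartbeats stop arriving."""

    def __init__(self, timeout=5.0):
        # Seconds without a heartbeat before a node is considered dead (illustrative value).
        self.timeout = timeout
        # Node address -> time of the last successful heartbeat.
        self.last_seen = {}

    def add(self, node):
        # A newly registered node starts out as alive.
        self.last_seen[node] = time.monotonic()

    def remove(self, node):
        self.last_seen.pop(node, None)

    def heartbeat(self, node):
        # Called whenever a heartbeat response arrives from the node.
        self.last_seen[node] = time.monotonic()

    def alive_nodes(self):
        # Only nodes heard from recently are eligible to receive requests.
        now = time.monotonic()
        return [n for n, t in self.last_seen.items() if now - t <= self.timeout]
```

A real Load Balancer would run the heartbeat probes on a background timer; here the bookkeeping alone shows how the registry decides which nodes are routable.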
A Load Balancer generally routes requests based on one of the following criteria:
- Round robin – In this mode, the nodes are selected one after another in a round-robin fashion.
- Priority – In this mode, some nodes are given preference over others.
- Least connections – In this mode, the request is routed to the node with the least number of active connections.
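The three strategies above can be sketched as small selection classes. This is an illustrative sketch only; the class names and the convention that a lower number means higher priority are assumptions made for the example.

```python
import itertools

class RoundRobin:
    """Cycle through the nodes one after another."""
    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)

    def pick(self):
        return next(self._cycle)

class Priority:
    """Always prefer the highest-priority node that is alive."""
    def __init__(self, nodes_by_priority):
        # nodes_by_priority: list of (priority, node); lower number = preferred.
        self._nodes = sorted(nodes_by_priority)

    def pick(self, is_alive=lambda node: True):
        for _, node in self._nodes:
            if is_alive(node):
                return node
        raise RuntimeError("no node available")

class LeastConnections:
    """Route to the node currently holding the fewest connections."""
    def __init__(self, nodes):
        self.connections = {n: 0 for n in nodes}

    def pick(self):
        node = min(self.connections, key=self.connections.get)
        self.connections[node] += 1  # caller must call release() when done
        return node

    def release(self, node):
        self.connections[node] -= 1
```

Note the difference in state: round robin needs only a cursor, priority needs a liveness check, and least-connections must track connection counts, which is why the caller has to release connections when a request finishes.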
Load Balancer Types:
Load Balancers can be classified along two dimensions: by how they are packaged (hardware vs. software) and by where the balancing logic runs (server side vs. client side).
- Hardware Load Balancer
- Software Load Balancer
- Server side Load Balancer
- Client side Load Balancer
1. Hardware Load Balancer:
A Hardware Load Balancer is a preconfigured appliance (processor, memory, operating system, and the load-balancing software all included) optimized for high performance.
F5's BIG-IP is a well-known hardware Load Balancer.
2. Software Load Balancer:
A Software Load Balancer comprises just the load-balancing software; buyers must install and configure it themselves. Unlike a Hardware Load Balancer, the right memory and processor configurations have to be chosen by the operator for optimal performance.
HAProxy is a well-known software Load Balancer.
1. Server side Load Balancer:
In this type of load balancing, the balancing and routing happen on the server side. We can use more than one server side Load Balancer, either in a master/slave or a clustered mode. The client need not even know that it is hitting a Load Balancer: load balancing is transparent to the client, and the Load Balancer acts as a reverse proxy.
NGINX is a well-known server side Load Balancer.
2. Client side Load Balancer:
In this type of load balancing, the balancing and routing happen on the client side. The client must be aware of all the nodes and their addresses, and it uses an algorithm to select a suitable node for each request.
Netflix Ribbon, commonly paired with the Eureka service registry, is a well-known client side Load Balancer.
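The client-side approach can be sketched as follows: the client itself holds the node list and picks a node (here via simple round robin). This is an illustrative sketch; the class name and URLs are made up, and in a real system `refresh()` would be fed by a service registry rather than called manually.

```python
import itertools
import urllib.request

class ClientSideBalancer:
    """A client that balances its own requests across known nodes."""

    def __init__(self, base_urls):
        self._urls = list(base_urls)
        self._cycle = itertools.cycle(self._urls)

    def refresh(self, base_urls):
        # In practice this list would come from a service registry
        # (e.g., Eureka), so the client learns about added/removed nodes.
        self._urls = list(base_urls)
        self._cycle = itertools.cycle(self._urls)

    def pick(self):
        # The selection algorithm lives in the client, not on a server.
        return next(self._cycle)

    def get(self, path):
        # Send the request directly to the node this client selected.
        return urllib.request.urlopen(self.pick() + path)
```

Contrast this with the server side case: here every client carries the node list and the selection logic, which is exactly what gives rise to the trade-offs discussed next.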
Client Side Vs Server Side Load Balancing:
Advantages of Client Side over Server Side:
Since load balancing happens on the client side, there is no single point of failure; a server side load balancer, by contrast, can become a bottleneck as a huge number of clients is added.
Disadvantages of Client Side over Server Side:
- Each client sees only the load that it generates on each node, not the overall load on any node. So the aggregate load distribution across the nodes may be uneven or biased.
- The client needs to know about all the nodes, and whenever a node is added or removed, every client must be notified.
- Load balancing places an additional burden on the client.
Despite these disadvantages, client side load balancers are hugely popular in microservices-based cloud-native applications, where large numbers of clients keep being added as consumers of a service and a server side load balancer could become a single point of failure.