
Load balancing, high availability, and scalability architecture of the TCP access layer

Today, let's systematically walk through the load balancing, high availability, and scalability architecture of the TCP access layer.

How is the load balancing, high availability, and scalability architecture of a web-server implemented?
In Internet architectures, web-server access generally uses nginx as a reverse proxy to implement load balancing. The whole architecture is divided into three layers:
(1) The upstream calling layer, usually a browser or APP;
(2) The middle reverse proxy layer, nginx;
(3) The downstream real access cluster of web-servers; common web-servers include tomcat and apache;
What does the whole access process look like?
(1) The browser initiates a request to daojia.com;
(2) DNS resolves daojia.com to an external network IP;
(3) The browser accesses nginx through that external network IP;
(4) nginx applies a load balancing strategy; common strategies include round-robin, random, and ip-hash;
(5) nginx forwards the request to a web-server at an intranet IP;
Because HTTP connections are short-lived and web applications are stateless, in theory any HTTP request can land on any web-server and be handled normally.
Aside: if a request must land on one particular web-server, the architecture is probably unreasonable and will be hard to scale horizontally.
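To make the strategies in step (4) concrete, here is a minimal Python sketch of the three balancing strategies mentioned above (round-robin, random, ip-hash). The server addresses and the md5-based hash are illustrative assumptions, not nginx's actual implementation:

```python
import hashlib
import itertools
import random

# Hypothetical web-server pool; addresses are illustrative only.
SERVERS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

_rotation = itertools.cycle(SERVERS)

def round_robin() -> str:
    """Round-robin: hand out servers in a fixed rotation."""
    return next(_rotation)

def pick_random() -> str:
    """Random: any server may receive any request."""
    return random.choice(SERVERS)

def ip_hash(client_ip: str) -> str:
    """ip-hash: the same client IP always maps to the same server."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]
```

Note that ip-hash is the only one of the three that pins a client to a fixed server, which matters for the stateful TCP case discussed next.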
The problem is that TCP is a stateful connection: once a client and server establish a connection, every request from that client must land on the same tcp-server. How do we do load balancing in this case, and how do we guarantee horizontal scaling?
Option 1: Stand-alone tcp-server
A single tcp-server obviously guarantees request consistency:
(1) The client initiates a tcp request to tcp.daojia.com;
(2) DNS resolves tcp.daojia.com to an external network IP;
(3) The client initiates a request to the tcp-server through that external network IP;
What are the disadvantages of this scheme?
High availability cannot be guaranteed.
Option 2: Clustered tcp-servers
High availability can be ensured by building a tcp-server cluster, with the client performing load balancing:
(1) The client is configured with the domain names of three tcp-servers, tcp1/tcp2/tcp3.daojia.com;
(2) The client selects a tcp-server "at random", say tcp1.daojia.com;
(3) tcp1.daojia.com is resolved through DNS;
(4) The client connects to the real tcp-server through the resolved external network IP;
How to ensure high availability?
If the client finds that one tcp-server cannot be connected, it simply chooses another one.
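This client-side "random pick plus failover" behaviour can be sketched in a few lines of Python; the domain list, port, and timeout are illustrative assumptions:

```python
import random
import socket

# Hypothetical domain list baked into the client (Option 2).
TCP_SERVERS = ["tcp1.daojia.com", "tcp2.daojia.com", "tcp3.daojia.com"]

def connect_with_failover(port: int = 9000, timeout: float = 2.0) -> socket.socket:
    """Pick a tcp-server at random; on failure, fall back to the others."""
    candidates = TCP_SERVERS[:]
    random.shuffle(candidates)  # client-side "random" load balancing
    for host in candidates:
        try:
            # DNS resolution happens inside create_connection()
            return socket.create_connection((host, port), timeout=timeout)
        except OSError:
            continue  # this server is unreachable, try the next one
    raise ConnectionError("no tcp-server reachable")
```

Note that every attempt pays for a DNS lookup inside `create_connection()`, which is exactly the drawback discussed next.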
What are the disadvantages of this scheme?
Every connection requires one extra DNS lookup first:
(1) DNS hijacking is hard to prevent;
(2) An extra DNS lookup means a longer connection setup time, a drawback that is especially noticeable on mobile phones;
How to solve DNS problems?
Configuring the IPs directly in the client solves both problems. Many companies do this; it is commonly known as the "IP through train".
What are the problems with the "IP through train"?
The IPs are hard-coded in the client and load balancing is implemented on the client, which scales poorly:
(1) If an existing IP changes, the client is not notified in real time;
(2) If a new IP is added, i.e. the tcp-server cluster is expanded, the client is not notified in real time;
(3) If the load balancing strategy changes, the client has to be upgraded;
Option 3: The server implements load balancing
Only by sinking the complex strategies to the server side can the scalability problem be solved fundamentally.
A good solution is to add an HTTP interface and move the client's "IP configuration" and "balancing strategy" to the server:
(1) Before each connection to a tcp-server, the client calls a new get-tcp-ip interface; to the client, this HTTP interface simply returns the IP of one tcp-server;
(2) This HTTP interface implements the IP balancing strategy that used to live in the client;
(3) After obtaining a tcp-server IP, the client initiates a long TCP connection to that tcp-server, as before;
With this, the scalability problems are solved:
(1) If an existing IP changes, only the configuration of the get-tcp-ip interface needs to change;
(2) If an IP is added, again only the configuration of the get-tcp-ip interface changes;
(3) If the load balancing strategy changes, no client upgrade is needed;
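The get-tcp-ip interface described in steps (1)-(3) could be sketched with Python's standard http.server; the IP list, the `/get-tcp-ip` path, and the random policy are illustrative assumptions:

```python
import json
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

# Server-side configuration; changing this list needs no client upgrade.
TCP_SERVER_IPS = ["1.2.3.4", "1.2.3.5", "1.2.3.6"]  # illustrative IPs

def select_ip() -> str:
    """The balancing policy now lives on the server (random here)."""
    return random.choice(TCP_SERVER_IPS)

class GetTcpIpHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/get-tcp-ip":
            # Return the IP of exactly one tcp-server to the client.
            body = json.dumps({"ip": select_ip()}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

# To serve:
#   HTTPServer(("0.0.0.0", 8000), GetTcpIpHandler).serve_forever()
```

The client's only job is an HTTP GET followed by a TCP connect to the returned IP; everything about the pool and the policy stays server-side.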
However, a new problem arises. When all the IPs sat in the client, the client could switch to another IP when one failed, keeping the service available. The get-tcp-ip interface, by contrast, only maintains a static list of tcp-server cluster IPs and has no idea whether the tcp-server behind each IP is actually available. What should be done?
Option 4: tcp-server status reporting
How does the get-tcp-ip interface know whether each server in the tcp-server cluster is available? Active reporting by the tcp-servers is one potential solution: if a tcp-server hangs, its reports stop, and the get-tcp-ip interface stops returning that tcp-server's external network IP to clients.
What is the problem with this design?
Status reporting does solve tcp-server high availability, but the design makes a small "reverse dependency" coupling mistake: each tcp-server now depends on a web-server that has nothing to do with its own business.
Option 5: tcp-server status pull
A better solution: the web-server obtains the status of each tcp-server by "pulling", rather than each tcp-server reporting its own status by "pushing".
This way, each tcp-server is independent and decoupled, and only needs to focus on its core tcp business functions.
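A minimal sketch of the "pull" model: the web-server probes each tcp-server and keeps only the reachable ones, while the tcp-servers themselves stay passive. Using a plain TCP connect as the health probe is an assumption here; a real system might pull richer state (connection counts, load, etc.):

```python
import socket

def pull_status(ip: str, port: int = 9000, timeout: float = 1.0) -> bool:
    """The web-server probes a tcp-server; the tcp-server stays passive."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

def alive_servers(ips, port: int = 9000):
    """Only IPs that pass the probe are handed out by get-tcp-ip."""
    return [ip for ip in ips if pull_status(ip, port)]
```

The dependency now points the right way: the web-server knows about the tcp-servers, but no tcp-server knows about the web-server.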

Tasks such as high availability, load balancing, and scalability are handled entirely by the get-tcp-ip web-server.
As an aside, server-side load balancing has another advantage: it enables load balancing across heterogeneous tcp-servers as well as overload protection:
(1) Static implementation: load weights can be configured for the tcp-server IPs behind the web-server, so load is allocated according to each tcp-server's machine configuration (nginx has a similar feature);
(2) Dynamic implementation: the web-server can allocate load dynamically according to the tcp-server state it "pulls" back, and apply overload protection when a tcp-server's performance drops sharply;
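A sketch combining both ideas: static weights for heterogeneous machines plus a dynamic overload cutoff. The weights, load figures, and threshold are all illustrative assumptions:

```python
import random

# Static weights reflecting heterogeneous machine configs (illustrative).
WEIGHTS = {"1.2.3.4": 5, "1.2.3.5": 3, "1.2.3.6": 1}

# Load fractions dynamically "pulled" from each tcp-server (illustrative).
current_load = {"1.2.3.4": 0.4, "1.2.3.5": 0.9, "1.2.3.6": 0.2}

OVERLOAD_THRESHOLD = 0.8  # assumed cutoff for overload protection

def select_weighted() -> str:
    """Weighted-random pick that skips overloaded tcp-servers."""
    healthy = {ip: w for ip, w in WEIGHTS.items()
               if current_load.get(ip, 1.0) < OVERLOAD_THRESHOLD}
    if not healthy:
        raise RuntimeError("all tcp-servers overloaded")
    ips = list(healthy)
    return random.choices(ips, weights=[healthy[ip] for ip in ips])[0]
```

Here `1.2.3.5` is over the threshold, so it receives no new connections until its pulled load drops again.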
How does the web-server layer implement load balancing?
With an nginx reverse proxy: round-robin, random, or ip-hash.
How does a tcp-server ensure request consistency?
In the simplest case, with a single stand-alone tcp-server.
How to ensure high availability?
The client configures multiple tcp-server domain names.
How to prevent DNS hijacking and speed up connections?
The "IP through train": the client is configured with multiple tcp-server IPs.
How to ensure scalability?
The server provides a get-tcp-ip interface, which hides the load balancing strategy from the client and makes expansion convenient.
How to ensure high availability of the tcp-server cluster?
The tcp-servers "push" their state to the get-tcp-ip interface, or, better, the get-tcp-ip interface "pulls" the tcp-server state.

Details are important, but ideas matter more than details.

