
How to optimize server performance in high concurrency scenarios?

Preface

Recently, someone in the group asked how to set the tcp_nodelay parameter on Linux, and a few friends asked me the same thing. So today, starting from this question, we will talk about how to optimize server performance in high concurrency scenarios.

In fact, tcp_nodelay is not configured at the operating system level. It is an option (TCP_NODELAY) set on an individual TCP socket to turn off Nagle's algorithm (sometimes loosely called the "sticky packet" algorithm), so that small data packets are sent immediately instead of being coalesced. So much for TCP sockets. For the server itself, if we want it to support millions or even tens of millions of concurrent connections, how should we optimize it?




Operating system

Here, the operating system I use is CentOS 8. Checking the version of the operating system (for example with cat /etc/redhat-release) prints the following.

CentOS Linux release 8.0.1905 (Core)

For high concurrency scenarios, we mainly optimize the network performance of the operating system. The operating system has many parameters related to network protocols, and we improve server network performance, and with it the access performance of our applications, mainly by tuning these system parameters.

System parameters

In CentOS, we can view all system parameters with the following command.

/sbin/sysctl -a

Part of the output is shown below.

There are too many parameters here, about a thousand or so. In high concurrency scenarios it is impractical to tune all of them, so we focus on the network-related ones. To find them, we first list the top-level types of the system parameters with the following command.

/sbin/sysctl -a|awk -F "." '{print $1}'|sort -k1|uniq

The result information output by running the command is shown below.


The net type is the operating system parameter related to the network that we want to pay attention to. We can get the subtypes under the net type as follows.

/sbin/sysctl -a|grep "^net."|awk -F "[.| ]" '{print $2}'|sort -k1|uniq

The output result information is as follows.


In Linux, these network-related parameters can be modified in the /etc/sysctl.conf file. If a parameter does not yet exist in /etc/sysctl.conf, we can add it to the file. Running /sbin/sysctl -p then loads the settings from the file.
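Each sysctl parameter is also exposed as a file under /proc/sys, which gives a quick way to check whether a value from /etc/sysctl.conf is in effect. The following is a minimal sketch of that mapping. The helper names sysctl_path and read_sysctl are made up for illustration, and the simple split on "." assumes the parameter name contains no dotted interface names.

```python
from pathlib import Path


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl name to its file under /proc/sys.

    Assumption: every "." in the name is a separator, which holds for the
    net.core.* and net.ipv4.tcp_* parameters discussed in this article.
    """
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for key in ("net.core.rmem_max", "net.ipv4.tcp_rmem"):
        print(key, "=", read_sysctl(key))
```

On a machine without /proc/sys, read_sysctl simply returns None for every key.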

Among the subtypes of the net type, the subtypes we need to focus on are: core and ipv4.

Optimize socket buffers

If the server's network socket buffers are too small, the application has to read and write many times to process the same data, which greatly hurts the performance of our program. If the socket buffers are large enough, the performance of our program improves to a certain extent.

We can get information about the server socket buffer by entering the following command on the server's command line.

/sbin/sysctl -a|grep "^net."|grep "[r|w|_]mem[_| ]"

The output is as follows.

net.core.rmem_default = 212992
net.core.rmem_max = 212992
net.core.wmem_default = 212992
net.core.wmem_max = 212992
net.ipv4.tcp_mem = 43545        58062   87090
net.ipv4.tcp_rmem = 4096        87380   6291456
net.ipv4.tcp_wmem = 4096        16384   4194304
net.ipv4.udp_mem = 87093        116125  174186
net.ipv4.udp_rmem_min = 4096
net.ipv4.udp_wmem_min = 4096

Among them, the keywords max, default and min stand for the maximum, default and minimum values respectively, and the keywords mem, rmem and wmem stand for total memory, receive buffer memory and send buffer memory.

Note that parameters with the rmem and wmem keywords are measured in bytes, while those with the mem keyword (tcp_mem and udp_mem) are measured in pages. A page is the smallest unit of memory management in the operating system. On Linux, the default page size is 4KB.
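To make the byte/page distinction concrete, here is a small sketch that parses one line of the sysctl output and normalizes its values to bytes. The function name values_in_bytes is hypothetical, and the 4KB page size is the assumption stated above, not something read from the system.

```python
def values_in_bytes(line: str, page_size: int = 4096):
    """Parse 'name = v1 v2 ...' and return (name, values converted to bytes).

    Parameters whose last component contains the bare keyword 'mem'
    (tcp_mem, udp_mem) are counted in pages; rmem/wmem parameters are
    already in bytes.
    """
    name, _, raw = line.partition("=")
    name = name.strip()
    values = [int(v) for v in raw.split()]
    last_component = name.rsplit(".", 1)[-1]
    in_pages = "mem" in last_component.split("_")
    factor = page_size if in_pages else 1
    return name, [v * factor for v in values]


if __name__ == "__main__":
    for sample in (
        "net.ipv4.tcp_mem = 43545        58062   87090",
        "net.ipv4.tcp_rmem = 4096        87380   6291456",
    ):
        print(values_in_bytes(sample))
```

With the default values shown above, the tcp_mem maximum of 87090 pages works out to 356720640 bytes, roughly 340MB.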

How to optimize frequent sending and receiving of large files

If in a high concurrency scenario, large files need to be sent and received frequently, how can we optimize the performance of the server?

Here, the system parameters we can modify are as follows.


Here, we make an assumption: the system can allocate at most 2GB of memory to TCP, with a minimum of 256MB and a pressure value of 1.5GB. With 4KB pages, the minimum, pressure and maximum values of tcp_mem are 65536, 393216 and 524288 respectively, in pages.
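The page arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation under the assumptions in the text (4KB pages, 256MB / 1.5GB / 2GB); the helper names are made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def bytes_to_pages(size_bytes: int, page_size: int = 4 * KIB) -> int:
    """Convert a byte count to whole memory pages, rounding up."""
    return -(-size_bytes // page_size)


def tcp_mem_pages(min_bytes: int, pressure_bytes: int, max_bytes: int,
                  page_size: int = 4 * KIB):
    """Return the (min, pressure, max) triple for tcp_mem, in pages."""
    return tuple(
        bytes_to_pages(b, page_size)
        for b in (min_bytes, pressure_bytes, max_bytes)
    )


if __name__ == "__main__":
    low, pressure, high = tcp_mem_pages(256 * MIB, 3 * GIB // 2, 2 * GIB)
    print(f"net.ipv4.tcp_mem = {low} {pressure} {high}")
    # prints: net.ipv4.tcp_mem = 65536 393216 524288
```

On a system with a different page size, pass it explicitly; os.sysconf("SC_PAGE_SIZE") reports the actual value on Linux.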

If the average data packet of each file is 512KB, and each socket read and write buffer should hold at least 2 packets, 4 packets by default, and at most 10 packets, then the minimum, default and maximum values of tcp_rmem and tcp_wmem are 1048576, 2097152 and 5242880 respectively, in bytes. Accordingly, rmem_default and wmem_default are 2097152, and rmem_max and wmem_max are 5242880.
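The same buffer sizing can be written as a short calculation. This sketch only reproduces the multiplication from the paragraph above; the function name and its parameters are illustrative, and the 512KB packet size is the article's assumption.

```python
KIB = 1024


def socket_buffer_sizes(packet_bytes: int, min_packets: int = 2,
                        default_packets: int = 4, max_packets: int = 10):
    """Return (min, default, max) buffer sizes in bytes."""
    return (
        packet_bytes * min_packets,
        packet_bytes * default_packets,
        packet_bytes * max_packets,
    )


if __name__ == "__main__":
    low, default, high = socket_buffer_sizes(512 * KIB)
    print(f"net.ipv4.tcp_rmem = {low} {default} {high}")
    print(f"net.ipv4.tcp_wmem = {low} {default} {high}")
    print(f"net.core.rmem_default = {default}")
    print(f"net.core.rmem_max = {high}")
```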

Note: The details of how these values are calculated will be described later.

Note also that if a buffer exceeds 65535 bytes, the net.ipv4.tcp_window_scaling parameter needs to be set to 1, because the window field in the TCP header is only 16 bits wide.
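To see why 65535 is the threshold, recall that window scaling multiplies the 16-bit window by a power of two, with the shift capped at 14 in RFC 7323. The sketch below computes the smallest shift that would let a single window cover a given buffer. It is illustrative arithmetic only: the function name is hypothetical, and the shift a real kernel advertises is chosen by its own logic.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit TCP window field
MAX_WINDOW_SHIFT = 14        # upper bound on the scale shift (RFC 7323)


def window_scale_shift(buffer_bytes: int) -> int:
    """Smallest shift s such that 65535 << s covers buffer_bytes (capped at 14)."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < buffer_bytes and shift < MAX_WINDOW_SHIFT:
        shift += 1
    return shift


if __name__ == "__main__":
    for size in (65535, 65536, 5242880):
        print(size, "->", window_scale_shift(size))
```

For the 5242880-byte maximum used in this article the result is 7, so without window scaling the large buffers could never be fully used by one connection.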

After the above analysis, our final system tuning parameters are as follows.

net.core.rmem_default = 2097152
net.core.rmem_max = 5242880
net.core.wmem_default = 2097152
net.core.wmem_max = 5242880
net.ipv4.tcp_mem = 65536  393216  524288
net.ipv4.tcp_rmem = 1048576  2097152  5242880
net.ipv4.tcp_wmem = 1048576  2097152  5242880

Optimize TCP connections

Anyone with some understanding of computer networks knows that a TCP connection goes through a "three-way handshake" to open and a "four-way wave" to close, and relies on a series of mechanisms that support reliable transmission, such as slow start, the sliding window and Nagle's algorithm. These guarantee the reliability of TCP, but they can sometimes hurt the performance of our program.

So, in high concurrency scenarios, how do we optimize TCP connections?

(1) Turn off Nagle's algorithm

If the user is very sensitive to request latency, we need to set the TCP_NODELAY option on the TCP socket to turn off Nagle's algorithm so that data packets are sent immediately. We can also set net.ipv4.tcp_syncookies to 1, which enables SYN cookies so that the server can keep accepting connections when the SYN queue overflows.
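Because TCP_NODELAY is a per-socket option rather than a sysctl, it is set in application code. The following is a minimal sketch using Python's standard socket module; the function name make_low_latency_socket is made up for illustration, and a real server would typically set the option on each accepted connection.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value turns TCP_NODELAY on, so small writes are not coalesced.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    with make_low_latency_socket() as sock:
        enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
        print("TCP_NODELAY enabled:", bool(enabled))
```

Disabling Nagle's algorithm trades bandwidth efficiency for latency, so it suits request/response traffic better than bulk transfers.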

(2) Avoid frequent creation and recycling of connection resources

Creating and recycling network connections is expensive. We can optimize server performance by closing idle connections and reusing connection resources that have already been allocated. This kind of reuse should be familiar: thread pools and database connection pools reuse threads and database connections in the same way.

We can close the idle connection of the server and reuse the allocated connection resources through the following parameters.


(3) Avoid sending data packets repeatedly

TCP supports a timeout retransmission mechanism. If the sender has sent a data packet to the receiver but has not received an acknowledgment within the configured interval, the retransmission is triggered. To avoid resending packets that were already delivered successfully, we set the server's net.ipv4.tcp_sack parameter to 1. This enables selective acknowledgment, which lets the receiver report exactly which segments arrived so that the sender retransmits only the missing ones.
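The benefit of selective acknowledgment can be shown with a toy model. The sketch below is a deliberate simplification: it uses small segment numbers instead of byte sequence numbers, ignores fast retransmit and timers, and assumes a sender without SACK resends everything from the first gap onward. All function names are hypothetical.

```python
def cumulative_ack(received) -> int:
    """Next in-order segment number the receiver expects (numbering starts at 1)."""
    got = set(received)
    expected = 1
    while expected in got:
        expected += 1
    return expected


def retransmit_without_sack(sent, received):
    """Sender only knows the cumulative ACK, so everything after it is resent."""
    ack = cumulative_ack(received)
    return [seg for seg in sent if seg >= ack]


def retransmit_with_sack(sent, received):
    """Receiver reports exactly what arrived, so only the holes are resent."""
    got = set(received)
    return [seg for seg in sent if seg not in got]


if __name__ == "__main__":
    sent = [1, 2, 3, 4, 5]
    received = [1, 2, 4, 5]  # segment 3 was lost
    print("without SACK:", retransmit_without_sack(sent, received))
    print("with SACK:   ", retransmit_with_sack(sent, received))
```

In this model, losing segment 3 out of 5 costs three retransmissions without SACK and one with it.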

(4) Increase the number of server file descriptors

In Linux, every network connection also occupies a file descriptor, so the more connections there are, the more file descriptors are in use. If the file descriptor limit is set too low, the server cannot accept further connections once the limit is reached, which caps its concurrency. In that case, we need to increase the number of file descriptors available to the server.

For example: fs.file-max = 10240000, which means that the system as a whole can open up to 10240000 file handles.
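Note that fs.file-max is the system-wide ceiling, while each process also has its own limit (RLIMIT_NOFILE, the value shown by ulimit -n), which is not covered above but usually needs raising as well. The sketch below inspects both from Python on a Unix-like system; the helper names are made up for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) per-process open-file limits."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def read_file_max(path: str = "/proc/sys/fs/file-max"):
    """Return the system-wide file handle limit, or None if unavailable."""
    try:
        with open(path) as handle:
            return int(handle.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("per-process soft limit:", soft)
    print("per-process hard limit:", hard)
    print("system-wide fs.file-max:", read_file_max())
```

A server process can only use as many descriptors as its own soft limit allows, however large fs.file-max is.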

