
[High Concurrency] The interviewer asked me how to implement rate limiting with Nginx

Before we begin

Recently, many readers have told me that they have learned a great deal from my articles, and hearing that genuinely makes me happy; it is a joy when my writing helps people. Quite a few readers have landed offers from major companies after studying these articles, and many have sharpened their skills to become core business developers at their own companies. Glacier is truly happy for you all. I hope you keep learning as always, maintain that mindset of continuous study, and go further and further down the road of technology.

What should I write about today? After some thought, I decided on a hands-on high-concurrency article: how to implement rate limiting with Nginx. If there are topics you would like me to cover, leave me a message on WeChat or directly on the official account.

Rate limiting measures

If you have read "[High Concurrency] High-Concurrency Seckill System Architecture Decrypted: Not All Seckills Are Seckills!", you will remember what I said there: many articles and posts on the Internet claim that rate limiting in a seckill system is done at order placement, via asynchronous peak shaving, and that is nonsense! Placing an order is a late step in the overall seckill flow, so rate limiting must happen earlier in the pipeline; throttling in the later stages of the seckill business is useless.

As a high-performance web proxy and load balancing server, Nginx is often deployed at the front of Internet applications. There, we can configure Nginx to limit both the request rate per IP address and the number of concurrent connections.

Nginx's official rate limiting modules

Official Nginx provides two mechanisms for limiting requests and connections per IP:

  • limit_req_zone limits the number of requests per unit of time, that is, the request rate, using the leaky bucket algorithm.

  • limit_conn (together with limit_conn_zone, from ngx_http_limit_conn_module) limits the number of simultaneous connections, that is, the concurrency; a combined sketch of both follows this list.
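As a quick orientation before diving into each directive, here is a minimal sketch wiring up both mechanisms at once. The zone names one and addr and the /search/ and /download/ paths are illustrative placeholders; every directive is explained in detail below.

http {
    # Request-rate zone: keyed by client IP, 10 MB of shared memory, 1 request/second.
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    # Concurrent-connection zone: keyed by client IP, 10 MB of shared memory.
    limit_conn_zone $binary_remote_addr zone=addr:10m;

    server {
        location /search/ {
            limit_req zone=one burst=5;   # rate limit with a burst buffer of 5
        }
        location /download/ {
            limit_conn addr 1;            # at most 1 concurrent connection per IP
        }
    }
}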

limit_req_zone parameter configuration

limit_req_zone and limit_req parameter description

Syntax: limit_req_zone key zone=name:size rate=rate;
Default:    —
Context:    http

Syntax: limit_req zone=name [burst=number] [nodelay];
Default:    —
Context:    http, server, location

limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
  • The first parameter: $binary_remote_addr keys the limit on the client address. The binary_ prefix reduces the memory footprint compared with $remote_addr, and requests from the same client IP are throttled together.

  • The second parameter: zone=one:10m creates a 10 MB shared-memory zone named one, which stores the access-frequency state.

  • The third parameter: rate=1r/s sets the allowed request rate for clients with the same key; here it is one request per second. Other units are possible, for example 30r/m (30 requests per minute).

limit_req zone=one burst=5 nodelay;
  • The first parameter: zone=one selects which shared-memory zone to apply the limit with, matching the name defined by limit_req_zone above.

  • The second parameter: burst=5 deserves attention. It sets a buffer of size 5 (burst as in "burst of traffic"): when a burst of requests arrives, up to 5 requests exceeding the rate limit are queued in this buffer instead of being rejected immediately.

  • The third parameter: nodelay. If set, requests admitted to the burst buffer are processed immediately rather than being delayed, and once the buffer is full any further requests are rejected with 503 at once. If not set, buffered requests are queued and released gradually at the configured rate.

limit_req_zone example

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;

    server {
        location /search/ {
            limit_req zone=one burst=5 nodelay;
        }
    }
}

The following configuration restricts requests from specific user agents (such as search engine crawlers). Note that limit_req_zone belongs in the http context while the if block belongs in server or location; requests with an empty key are not accounted, so ordinary visitors are not limited:

limit_req_zone  $anti_spider  zone=one:10m   rate=10r/s;

if ($http_user_agent ~* "googlebot|bingbot|Feedfetcher-Google") {
    set $anti_spider $http_user_agent;
}
limit_req zone=one burst=100 nodelay;
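As a sketch of an alternative (assuming the same zone name), a map in the http context can populate $anti_spider instead of an if block. Since requests with an empty key are not accounted, only the matched crawlers are ever limited:

map $http_user_agent $anti_spider {
    default                                     "";
    "~*googlebot|bingbot|Feedfetcher-Google"    $http_user_agent;
}
limit_req_zone $anti_spider zone=one:10m rate=10r/s;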

Other parameters

Syntax: limit_req_log_level info | notice | warn | error;
Default:    
limit_req_log_level error;
Context:    http, server, location

Sets the logging level used when the server rejects or delays requests because of the limit. Delays are logged one level lower than rejections; for example, with limit_req_log_level notice, delays are logged at the info level.
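A minimal sketch of this setting in context:

# Rejections are logged at warn; delayed requests are then logged at notice.
limit_req_log_level warn;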

Syntax: limit_req_status code;
Default:    
limit_req_status 503;
Context:    http, server, location

Sets the status code returned for rejected requests. The value must be between 400 and 599.
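Many deployments prefer 429 (Too Many Requests) over the default 503, since it tells clients explicitly that they are being throttled; a minimal sketch:

# Return 429 Too Many Requests instead of the default 503 for throttled clients.
limit_req_status 429;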

ngx_http_limit_conn_module parameter configuration

ngx_http_limit_conn_module parameter description

This module limits the number of connections per defined key, for example, connections from a single IP address. Not all connections are counted: a connection is counted only when the server is processing a request on it and the whole request header has already been read.

Syntax: limit_conn zone number;
Default:    —
Context:    http, server, location
limit_conn_zone $binary_remote_addr zone=addr:10m;

server {
    location /download/ {
        limit_conn addr 1;
    }
}

With this configuration, only one connection per IP address is allowed in the /download/ location at a time.
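A common companion, sketched here against the same addr zone (the 100k value is purely illustrative), is the core limit_rate directive, which caps per-connection bandwidth so a single downloader cannot saturate the link:

location /download/ {
    limit_conn addr 1;     # at most one concurrent connection per client IP
    limit_rate 100k;       # and at most 100 KB/s on that connection
}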

limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_conn_zone $server_name zone=perserver:10m;
 
server {
    ...
    limit_conn perip 10;
    limit_conn perserver 100;
}

Several limit_conn directives can apply at the same time. For example, the configuration above limits the number of connections to the server per client IP while also limiting the total number of connections to the virtual server.

Syntax: limit_conn_zone key zone=name:size;
Default:    —
Context:    http
limit_conn_zone $binary_remote_addr zone=addr:10m;

Here, the client IP address serves as the key. Note that $binary_remote_addr is used instead of $remote_addr. The $remote_addr variable's size varies from 7 to 15 bytes, and the stored state then occupies either 32 or 64 bytes on 32-bit platforms and always 64 bytes on 64-bit platforms. The $binary_remote_addr variable's size is always 4 bytes for IPv4 addresses and 16 bytes for IPv6 addresses, and the stored state then always occupies 32 or 64 bytes on 32-bit platforms and 64 bytes on 64-bit platforms. One megabyte of zone can keep about 32,000 32-byte states or about 16,000 64-byte states, so the 10m zone above can track roughly 160,000 IPv4 clients on a 64-bit platform. If the zone storage is exhausted, the server returns an error to all further requests.

Syntax: limit_conn_log_level info | notice | warn | error;
Default:    
limit_conn_log_level error;
Context:    http, server, location

Sets the logging level used when the server limits the number of connections.

Syntax: limit_conn_status code;
Default:    
limit_conn_status 503;
Context:    http, server, location

Sets the return value for rejected requests.

Nginx rate limiting in practice

Limit access rate

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server { 
    location / { 
        limit_req zone=mylimit;
    }
}

The above rule limits each IP to a request rate of 2r/s and applies the rule to the root location. What happens if a single IP sends multiple requests concurrently within a very short period?

Using a single IP, we sent 6 requests within 10 ms: only 1 succeeded and the other 5 were rejected. We set the rate to 2r/s, so why did only 1 succeed? Is Nginx wrong? Of course not: Nginx accounts for requests at millisecond granularity. A rate of 2r/s translates to one request allowed per 500 ms for a single IP, and the second request is admitted only from the 501st millisecond onward.

burst buffering

We saw that when a large number of requests arrive in a short window, Nginx counts at millisecond precision and rejects everything over the limit outright. That is too harsh for real scenarios. In a real network environment, requests do not arrive at a uniform rate; they very likely come in bursts, that is, in waves. Nginx takes this into account: the burst keyword enables buffering of burst requests instead of rejecting them directly.

Take a look at our configuration:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server { 
    location / { 
        limit_req zone=mylimit burst=4;
    }
}

We added burst=4, which means each key (here, each IP) is allowed up to 4 buffered burst requests. What happens if a single IP sends 6 requests within 10 ms?

Compared with the first experiment, the number of successful requests increased by exactly 4, the burst size we configured. The processing flow is: 1 request is handled immediately, 4 requests are placed in the burst queue, and the remaining request is rejected. Through the burst parameter, Nginx's rate limiter gains the ability to buffer and absorb burst traffic.

But please note: the role of burst is merely to let surplus requests queue first and be processed gradually. Without the nodelay parameter, queued requests are not handled immediately; they are released one by one, at millisecond precision, at the speed set by rate.

nodelay reduces queuing time

With burst buffering, we saw that setting the burst parameter lets Nginx absorb a degree of burstiness: surplus requests are queued first and processed gradually, which smooths traffic. However, if the queue is large, requests wait a long time, and from the user's perspective the response time (RT) grows, which is very unfriendly. What is the solution? The nodelay parameter: a request is processed as soon as it enters the burst queue, handled immediately by a worker. Note that this means that with burst plus nodelay, the instantaneous QPS of the system may exceed the threshold set by rate. The nodelay parameter only takes effect when used together with burst.

Continuing from the burst-buffering configuration, we add the nodelay option:

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server { 
    location / { 
        limit_req zone=mylimit burst=4 nodelay;
    }
}

Again, a single IP sends 6 requests concurrently within 10 ms.

Compared with plain burst buffering, the success count is unchanged, but the total elapsed time is much shorter. How can this be explained? With plain burst buffering, 4 requests sat in the queue and the worker process released one every 500 ms (rate=2r/s), so the last request waited 2 s before being processed. Here, requests enter the queue just as before, but queued requests are eligible for immediate processing, so all five are handled essentially at once and the elapsed time shrinks accordingly.

However, please note that although setting burst and nodelay shortens the handling of burst requests, it does not raise the long-term throughput ceiling. That ceiling is still determined by rate: nodelay only guarantees that buffered requests are processed immediately, while Nginx still restores queue slots at the configured rate, much like the rate at which tokens are generated in a token bucket.

Seeing this, you may ask: with nodelay added, which "bucket" is the limiter, leaky bucket or token bucket? Of course it is still a leaky bucket. Consider what happens in a token bucket when the tokens run out: since there is a request queue, the next request would be buffered, limited only by the queue size. But is buffering those requests still meaningful at that point? If the server is overloaded, the queue keeps growing and the RT keeps rising; even if a request is eventually served after a long wait, the result is of little value to the user. So when capacity runs out, the most sensible behavior is to reject requests outright, which is exactly what the leaky bucket does.

custom return value

limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server { 
    location / { 
        limit_req zone=mylimit burst=4 nodelay;
        limit_req_status 598;
    }
}

If limit_req_status is not configured, rejected requests return the default 503; with the configuration above, they return 598 instead.

