
A practical analysis and summary of MySQL slow queries

MySQL slow query, in full the slow query log, is a log that MySQL provides to record statements whose response time exceeds a threshold. This article introduces our practice of analyzing and governing slow queries; refer to it as needed.


1. Why do this

1 What is slow SQL?

This refers to MySQL slow queries: specifically, SQL whose execution time exceeds the value of long_query_time.

We often hear about MySQL's common logs: the binary log (binlog), relay log (relaylog), redo log (redolog), undo log (undolog), and so on. For slow queries there is also a slow log (slowlog), which records statements whose response time exceeds the threshold.

Don't be misled by the name "slow query" into thinking the slow query log only records SELECT statements: it also records INSERT, UPDATE and other DML statements whose execution time exceeds the threshold set by long_query_time.

# Check whether the slow query log is enabled
show variables like "slow_query_log%";

# Check the slow query threshold (unit: seconds)
show variables like "long_query_time";

For the AliSQL-X-Cluster that we use (XDB), the slow query log is enabled by default, and long_query_time is set to 1 second.

2 Why does slow query cause failure?

Truly slow SQL is often accompanied by large row scans, temporary-file sorts, or frequent disk flushes. The direct impact is increased disk IO: normal SQL then also turns into slow SQL, and executions time out across the board.

After Double 11 last year, in response to the problems exposed on the technical side, the Cainiao CTO line launched a number of special governance projects. Each CTO-D sponsored one, and my large team was responsible for the slow SQL governance project.

2. To what extent

1 How to measure the severity of slow SQL in an application?

Micro average

sum(slow SQL executions in an Aone application)
-----------------------------------------------
sum(all SQL executions in the Aone application)

We take a larger value to mean greater impact, and a smaller value to mean the impact is probably small.

In the extreme cases, every SQL the application executes is slow SQL and the value is 1, or no SQL the application executes is slow SQL and the value is 0.

However, this metric has poor discrimination, especially for applications whose SQL QPS is very high and whose SQL is not slow most of the time: occasional slow SQL gets drowned out.

Another question: is an occasional slow SQL really slow SQL? We encountered many statements captured by the slow query log that were in fact affected by other slow SQL, MySQL disk jitter, optimizer choices and similar causes; queries whose regular performance is clearly not slow became "slow SQL".

Macro average

sum(slow SQL 1 executions)           sum(slow SQL n executions)
--------------------------  +  …  +  --------------------------
  sum(SQL 1 executions)                sum(SQL n executions)
---------------------------------------------------------------
                               n

This algorithm rests on each captured slow SQL having a certain number of executions, which reduces the impact of spurious slow SQL.

But when an application's QPS is very low, that is, very few SQL executions per day, a spurious slow SQL will badly skew the statistic.

Number of executions

sum(slow SQL executions in an Aone application over the last week)
------------------------------------------------------------------
                                7

Counting the average number of slow SQL executions per day over the last week eliminates the spurious-SQL problem of the macro average.

Number of slow SQL templates

The dimensions above all have a time window. To trace the historical handling of slow SQL, we also introduced a global dimension: the number of slow SQL templates.

count(distinct(Aone application slow SQL template))
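Taken together, the four metrics can be sketched in a few lines of Python. The input format here is invented for illustration (per-template pairs of slow and total execution counts); real numbers would come from slow-log aggregation.

```python
def micro_average(stats):
    """stats: {sql_template: (slow_executions, total_executions)}"""
    slow = sum(s for s, _ in stats.values())
    total = sum(t for _, t in stats.values())
    return slow / total if total else 0.0

def macro_average(stats):
    """Average of per-template slow ratios, over templates seen in the slow log."""
    ratios = [s / t for s, t in stats.values() if s > 0]
    return sum(ratios) / len(ratios) if ratios else 0.0

def daily_executions(stats, days=7):
    """Average slow SQL executions per day over the window."""
    return sum(s for s, _ in stats.values()) / days

def template_count(stats):
    """Number of distinct slow SQL templates."""
    return sum(1 for s, _ in stats.values() if s > 0)

# One high-QPS template that is rarely slow, one low-QPS template that is always slow.
stats = {
    "SELECT ... WHERE id = ?":        (10, 100000),  # occasional slow SQL, drowned out
    "SELECT ... ORDER BY gmt_create": (50, 50),      # always slow
}
print(micro_average(stats))   # tiny: the micro average hides the always-slow template
print(macro_average(stats))   # about one half: the macro average surfaces it
```

This reproduces the discrimination problem described above: the micro average is dominated by the high-QPS template, while the macro average makes the always-slow template visible.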

2 Goals

  • Core applications: eliminate all slow SQL

  • Ordinary applications: reduce the micro average metric by 50%

3 CTO report

Taking each CTO-D line as a unit, we aggregate a weighted average of its applications' statistics across the dimensions above, rank the lines from low to high, highlight the top 3 at the head and the tail, and broadcast the ranking weekly.

3. Why me

I guess it is related to my background: I come from C/C++, was responsible for designing and implementing a company-level multi-active architecture at my previous company, and have a good understanding of MySQL.

In addition, it may be a matter of having no conflict of interest: my small team had just started its business and had no slow SQL of its own, so it could step into every business line.

4. Supporting actions

1 Group MySQL specification

Excerpts from the Index Specification:

[Mandatory] Joins of more than three tables are prohibited. The data types of the fields being joined must be absolutely consistent; in multi-table queries, ensure that the join fields are indexed.

Note: Even a two-table join must consider table indexes and SQL performance.

[Mandatory] When creating an index on a varchar field, the index length must be specified. Indexing the whole field is unnecessary; decide the length by the actual discrimination of the text.

Note: Index length and discrimination are a pair of trade-offs. Generally, for string data, an index of length 20 achieves discrimination of 90% or more, which can be measured with count(distinct left(column, index_length)) / count(*).
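The discrimination check can be prototyped anywhere there is data. Below is a sketch using Python's bundled sqlite3, where SQLite's substr(col, 1, n) stands in for MySQL's left(col, n); the table and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.executemany("INSERT INTO t VALUES (?)",
                 [("warehouse-hz-0001",), ("warehouse-hz-0002",),
                  ("warehouse-sh-0001",), ("warehouse-sh-0002",)])

def discrimination(prefix_len):
    # MySQL equivalent: SELECT count(distinct left(name, ?)) / count(*) FROM t
    row = conn.execute(
        "SELECT CAST(COUNT(DISTINCT substr(name, 1, ?)) AS REAL) / COUNT(*) FROM t",
        (prefix_len,)).fetchone()
    return row[0]

print(discrimination(9))    # only the shared 'warehouse' prefix: low discrimination
print(discrimination(12))   # prefix includes the region code: better
print(discrimination(17))   # full value: every row distinct
```

Sweeping prefix_len upward and stopping once the ratio plateaus gives a reasonable prefix index length for the column.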

[Mandatory] Left-fuzzy or full-fuzzy matching in page search is strictly prohibited; if needed, solve it with a search engine.

Note: Index files have B-Tree's leftmost-prefix matching property; if the leftmost value is undetermined, the index cannot be used.

[Recommendation] Prevent implicit conversion caused by mismatched field types, which defeats the index.

[Reference] Avoid the following extreme misunderstandings when creating an index:

1) Better too many indexes than too few

Believing that every query needs its own index.

2) Stinginess with indexes

Believing that indexes consume space and seriously slow down updates and inserts.

3) Resistance to unique indexes

Believing that uniqueness must always be enforced at the application layer via "check before insert".

2 DB change standard

DDL must control the speed of change, pay attention to grayscale rollout and concurrency control, and be released within the designated change window.

5. Some examples of optimizations I took part in

1 Uneven data distribution


1) Unreasonable database and table sharding

The business data is split into 8 databases with 16 tables each. Inspecting the table spaces shows the data concentrated in roughly two tables per database: the sharding strategy is flawed. In addition, the estimate of business growth was too high; I'll not expand on that here.

2) Unreasonable indexes

A single table had a composite index idx_logistics_corp_id_special_id, but even so its discrimination was too low. Experiments and business feedback showed that the field combination (logistics_corp_id, transport_type_id) has very high discrimination, and the business also has a scenario that queries on transport_type_id alone.
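The single-column lookup on transport_type_id matters because a composite index only serves queries that constrain its leftmost column(s). A sketch with Python's sqlite3 (hypothetical table; MySQL B-Tree indexes follow the same leftmost-prefix rule):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transport "
             "(logistics_corp_id INT, transport_type_id INT, payload TEXT)")
conn.execute("CREATE INDEX idx_corp_type "
             "ON transport (logistics_corp_id, transport_type_id)")

def plan(sql):
    # Concatenate the 'detail' column of EXPLAIN QUERY PLAN output.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Leftmost column constrained: the composite index is used (SEARCH ... USING INDEX).
print(plan("SELECT * FROM transport WHERE logistics_corp_id = 1"))
# Only the second column constrained: full scan; a separate index would be needed.
print(plan("SELECT * FROM transport WHERE transport_type_id = 2"))
```

This is why a standalone query pattern on transport_type_id argues for either putting it first in the composite index or giving it its own index.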


2 Indexing issues

SELECT COUNT(0) AS `tmp_count`
FROM (
    SELECT
      `table_holder`.`user_id`,
      `table_holder`.`sc_item_id`,
      SUM(CASE `table_holder`.`inventory_type`
            WHEN 1 THEN `table_holder`.`quantity` ELSE 0
          END) AS `saleable_quantity`,
      SUM(CASE `table_holder`.`inventory_type`
            WHEN 1 THEN `table_holder`.`lock_quantity` ELSE 0
          END) AS `saleable_lock_quantity`,
      SUM(CASE `table_holder`.`inventory_type`
            WHEN 401 THEN `table_holder`.`quantity` ELSE 0
          END) AS `transfer_on_way_quantity`,
      `table_holder`.`store_code`,
      MAX(`table_holder`.`gmt_modified`) AS `gmt_modified`
    FROM `table_holder`
    WHERE (`table_holder`.`is_deleted` = 0)
      AND (`table_holder`.`quantity` > 0)
      AND `table_holder`.`user_id` IN (3405569954)
      AND `table_holder`.`store_code` IN ('ZJJHBHYTJJ0001', '...1000 more')
    GROUP BY
      `table_holder`.`user_id`,
      `table_holder`.`sc_item_id`
    ORDER BY
      `table_holder`.`user_id` ASC,
      `table_holder`.`sc_item_id` ASC
) `a`;

The table in this case has an index on store_code, so at first glance it looks fine, with nothing to optimize. In fact, the execution plan showed that MySQL chose a full table scan. Experimenting with this case, we found that once the number of range (IN) values exceeds 200, the optimizer no longer uses the field's index; this matches MySQL's eq_range_index_dive_limit, which defaults to 200 in 5.7 and later.

Finally, after pulling the related query SQL from the recent period and combining it with the business data distribution, we found that an index on (is_deleted, quantity) solved the problem.
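Besides adding the right index, a common application-side mitigation for oversized IN lists is to split them into bounded chunks and merge the results, keeping each query within the optimizer's comfort zone. A sketch; the 200-item chunk size and the query shape are illustrative:

```python
def chunked(values, size=200):
    """Split a long IN-list into bounded chunks."""
    for i in range(0, len(values), size):
        yield values[i:i + size]

def query_by_store_codes(run_query, store_codes):
    """run_query(codes) executes one SELECT ... WHERE store_code IN (...) batch."""
    rows = []
    for batch in chunked(store_codes, 200):
        rows.extend(run_query(batch))
    return rows

# Usage with a fake executor: 1000 codes become 5 batches of 200.
batch_sizes = []
fake_executor = lambda codes: (batch_sizes.append(len(codes)) or [])
query_by_store_codes(fake_executor, [f"S{i:04d}" for i in range(1000)])
print(batch_sizes)   # five batches of 200
```

The trade-off is more round trips to the database, so this fits batch and reporting paths better than latency-sensitive ones.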

To determine the length of the index actually used, check key_len in the execution plan. The key_len calculation formula (MySQL >= 5.6.4):

char(10), NULL allowed       = 10 * charset bytes + 1 (NULL flag)
char(10), NOT NULL           = 10 * charset bytes
varchar(10), NULL allowed    = 10 * charset bytes + 1 (NULL flag) + 2 (variable length)
varchar(10), NOT NULL        = 10 * charset bytes + 2 (variable length)
int, NULL allowed            = 4 + 1 (NULL flag)
int, NOT NULL                = 4
timestamp, NULL allowed      = 4 + 1 (NULL flag)
timestamp, NOT NULL          = 4
datetime, NULL allowed       = 5 + 1 (NULL flag)
datetime, NOT NULL           = 5

(charset bytes: utf8mb4 = 4, utf8 = 3, gbk = 2, latin1 = 1)
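The formula can be encoded directly as a helper for reading execution plans; the type names and charset byte widths follow the table above:

```python
CHARSET_BYTES = {"utf8mb4": 4, "utf8": 3, "gbk": 2, "latin1": 1}

def key_len(col_type, length=0, charset="utf8mb4", nullable=True):
    """key_len as reported in MySQL (>= 5.6.4) execution plans."""
    if col_type == "char":
        n = length * CHARSET_BYTES[charset]
    elif col_type == "varchar":
        n = length * CHARSET_BYTES[charset] + 2   # 2 bytes for the variable length
    elif col_type == "int":
        n = 4
    elif col_type == "timestamp":
        n = 4
    elif col_type == "datetime":
        n = 5
    else:
        raise ValueError("unsupported type: " + col_type)
    return n + (1 if nullable else 0)             # 1 byte for the NULL flag

print(key_len("varchar", 10, "utf8mb4", nullable=True))   # 43
print(key_len("int", nullable=False))                     # 4
print(key_len("datetime", nullable=True))                 # 6
```

Comparing the computed value against key_len in EXPLAIN tells you how many leading columns of a composite index the query actually used.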

3 Affected by neighbors

The index was used, but the slow query log still reported a scan of 20 million rows.

The indexed field itself has high discrimination.

At the same time, otherwise normal SQL on the instance turned into slow queries.

We then examined the DB's data-disk access during the same window.

Investigating the other instances sharing the physical machine, we found a database with many sort-heavy slow SQL near the problem time, whose temporary files written amounted to exactly 2 GB.

Multiple MySQL instance leader nodes were deployed on the same physical machine. Although CPU, memory and other resources are isolated through Docker, buffer IO isolation has not yet been achieved.

4 Unsolved

Summarizing and analyzing high-frequency queries and deriving suitable indexes with business input often solves day-to-day slow queries, but it is not a panacea.

For example, indexes may pile up over time until a table carries far more of them than it needs.

In some scenarios, such as supporting arbitrary multi-field combination queries with no mandatory field, it is clearly unreasonable to cover every combination with indexes.

In query scenarios, making a high-discrimination field a required condition is a good habit; when there are many query combinations, consider a store or search engine with better search support.

6. Daily operations

With each CTO-D line's in-depth governance, all the metrics improved greatly over before: core applications drove slow queries to zero, the most impactful slow SQL was solved, and my team's ranking moved from the bottom 3 to the top 3.
Slow SQL governance has entered the daily routine. Slow SQL work orders are pushed weekly; owners take them over and either resolve or explain them, and a habit and atmosphere of regular clean-up has largely formed. The slow SQL governance project has also been named and praised multiple times.

7. Summary

This is a belated summary. Looking back, the process of formulating the strategy and analyzing and solving the problems still feels worth sharing.



