Tips for optimizing Discuz! post pagination in MySQL
Foreword: Discuz! is a leading community (forum) system. It gives webmasters a one-stop website solution and is open source (although part of the code is encrypted), and it has contributed enormously to the development of this vertical industry. Nevertheless, the Discuz! source code has its share of pitfalls. The best known are the default use of the MyISAM engine, the floor-grabbing feature (competing for a particular reply position) built on MyISAM, and the session table's use of the MEMORY engine; see the earlier articles on these topics. This time we look at another problem in Discuz!'s application logic: how it handles paging through hot posts.
Our environment uses MySQL 5.6.6.
In the process of viewing posts and turning pages, SQL similar to the following will be generated:
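As a rough sketch, such a query might look like the one below. The table name `pre_forum_post` (the default post table in Discuz! X), the thread id 12345, and the `invisible` values are assumptions made for illustration; only the columns `tid`, `invisible`, and `dateline` are taken from the discussion in this article.

```sql
-- Sketch only: table name, thread id, and invisible values are assumptions.
SELECT *
FROM pre_forum_post
WHERE tid = 12345               -- hypothetical thread id
  AND invisible IN (0, -2)      -- hypothetical visibility states
ORDER BY dateline               -- replies in chronological order
LIMIT 15;                       -- first page, 15 replies per page
```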
The cost of this SQL execution is:
Handler_read_key -- the number of rows read through an index lookup; a high value is generally considered a good sign
Handler_read_next -- the number of times the next row was read in index order, usually caused by an index range scan or a full index scan; also generally considered a good sign
Handler_read_rnd -- the number of rows read from a fixed position. This value is usually large when results have to be sorted, and also when a full table scan is performed or a multi-table join cannot use an index
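One way to collect these counters for a single statement is to reset the session counters, run the statement, and read the `Handler_read%` status variables. The query itself is a sketch: `pre_forum_post`, the thread id, and the `invisible` values are assumptions.

```sql
-- Reset the session status counters (requires the RELOAD privilege).
FLUSH STATUS;

-- Run the statement being measured (hypothetical table name and values).
SELECT *
FROM pre_forum_post
WHERE tid = 12345
  AND invisible IN (0, -2)
ORDER BY dateline
LIMIT 15;

-- Read the handler counters accumulated by this session since the reset.
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Running the three statements in the same session matters, because the counters are kept per session.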
When a hot post has to be paged far back, for example:
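A deep-page request differs from the first-page query only in its LIMIT offset. The sketch below uses a hypothetical offset of 130,000 rows, with the same assumed table name and values as before.

```sql
-- Sketch only: table name, thread id, invisible values, and offset are assumptions.
SELECT *
FROM pre_forum_post
WHERE tid = 12345
  AND invisible IN (0, -2)
ORDER BY dateline
LIMIT 130000, 15;   -- skip 130000 rows, then return 15
```

MySQL still has to produce and discard every skipped row, which is why the offset drives the cost.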
The cost of this SQL becomes the following (note that Handler_read_key and Handler_read_rnd are much larger):
| Handler_read_key | 129876 | -- because many rows at the front have to be skipped
| Handler_read_next | 329881 | -- same as above
| Handler_read_rnd | 129875 | -- because a large result set has to be sorted first
As you can see, this SQL becomes very expensive on hot posts. If many visitors view the old replies of a hot post, or a search engine keeps requesting its old reply pages, the database server can easily be overwhelmed.
Summary: this SQL cannot use the `displayorder` index for sorting because the index's second column, `invisible`, is matched with a range condition (RANGE), so the rest of the composite index cannot be used to satisfy the sort on `dateline`. (By contrast, with WHERE tid=? AND invisible IN(?, ?) AND dateline=? the whole composite index can be used; note the difference between the two.)
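The difference can be checked with EXPLAIN. The sketch below assumes that `displayorder` is defined on (tid, invisible, dateline) in a table named `pre_forum_post`; the thread id, `invisible` values, and timestamp are hypothetical.

```sql
-- Assumed index definition: KEY displayorder (tid, invisible, dateline)

-- Case 1: several invisible values. Rows come out of the index grouped by
-- invisible, not ordered by dateline alone, so an extra sort pass is needed.
EXPLAIN
SELECT * FROM pre_forum_post
WHERE tid = 12345 AND invisible IN (0, -2)
ORDER BY dateline LIMIT 15;

-- Case 2: a single invisible value. Within one (tid, invisible) prefix the
-- index entries are already ordered by dateline, so no sort pass is needed.
EXPLAIN
SELECT * FROM pre_forum_post
WHERE tid = 12345 AND invisible = 0
ORDER BY dateline LIMIT 15;

-- Case 3: dateline is filtered rather than sorted on. All three index
-- columns take part in the lookup.
EXPLAIN
SELECT * FROM pre_forum_post
WHERE tid = 12345 AND invisible IN (0, -2) AND dateline = 1356969600;
```

Comparing the Extra column of the first two plans should show whether `Using filesort` appears.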
Knowing the cause, the optimization is straightforward:
Create a new index, idx_tid_dateline, containing only the two columns tid and dateline. (According to the statistics of the other indexes, the cardinality of item_type and item_id is too low, so they are not included in the composite index, although adding them could also be considered.)
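A minimal sketch of creating that index, assuming the post table is named `pre_forum_post`:

```sql
-- Sketch only: the table name is an assumption; the index name and columns
-- are the ones described above.
ALTER TABLE pre_forum_post
  ADD INDEX idx_tid_dateline (tid, dateline);

-- Confirm that the index exists and inspect its cardinality statistics.
SHOW INDEX FROM pre_forum_post;
```

Building an index on a large post table can take a while, so it is probably best done during off-peak hours.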
Here is the execution plan with the new index:
As you can see, the earlier Using filesort is gone, and the sort is satisfied directly from the index.
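To verify this on your own data, a plan check along the following lines could be used. The table name, thread id, and `invisible` values are assumptions.

```sql
-- Sketch only: hypothetical table name and values.
-- In the output, check that key is idx_tid_dateline and that
-- Extra no longer contains "Using filesort".
EXPLAIN
SELECT *
FROM pre_forum_post
WHERE tid = 12345
  AND invisible IN (0, -2)
ORDER BY dateline
LIMIT 15;
```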
However, when the hot post is paged to older replies, the corresponding SQL still does not use the new index:
Compare that with the execution plan when the optimizer is hinted to use the new index:
The query optimizer estimates that the latter has to scan far more rows than the former (over 110,000 more), so it judges the former to be more efficient.
In this example, however, sorting is the more expensive part, so eliminating the sort takes priority. We should therefore force the new index, that is, use the latter execution plan, and specify the index in the application code.
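Forcing the index can be done with an index hint. The following is a sketch under the same assumptions as before (table name `pre_forum_post`, hypothetical thread id, `invisible` values, and offset):

```sql
-- Sketch only: hypothetical table name, values, and offset.
-- FORCE INDEX makes the optimizer use idx_tid_dateline, so rows are read
-- in (tid, dateline) order and no separate sort pass is required.
SELECT *
FROM pre_forum_post FORCE INDEX (idx_tid_dateline)
WHERE tid = 12345
  AND invisible IN (0, -2)
ORDER BY dateline
LIMIT 130000, 15;
```

Because `invisible` is not part of this index, each candidate row still has to be read to apply that filter.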
Finally, let's compare the profiling statistics of the two execution plans when the hot post is paged to very old replies:
1. Using the old index (displayorder):
2. Using the new index (idx_tid_dateline):
Efficiency improves somewhat, but not dramatically, because the amount of data to scan is indeed larger, so the Sending data stage takes more time.
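Profiling figures of this kind can be collected with MySQL's session profiler. The sketch below runs both variants with index hints; the table name, thread id, `invisible` values, and offset are assumptions.

```sql
-- Enable statement profiling for the current session.
SET profiling = 1;

-- Variant 1: old index (hypothetical table name, values, and offset).
SELECT * FROM pre_forum_post FORCE INDEX (displayorder)
WHERE tid = 12345 AND invisible IN (0, -2)
ORDER BY dateline LIMIT 130000, 15;

-- Variant 2: new index.
SELECT * FROM pre_forum_post FORCE INDEX (idx_tid_dateline)
WHERE tid = 12345 AND invisible IN (0, -2)
ORDER BY dateline LIMIT 130000, 15;

-- List the profiled statements with their Query_ID and total duration.
SHOW PROFILES;

-- Per-stage breakdown; replace 1 and 2 with the Query_ID values
-- that SHOW PROFILES reported for the two statements.
SHOW PROFILE FOR QUERY 1;
SHOW PROFILE FOR QUERY 2;
```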
At this point we can borrow an earlier optimization scheme: [MySQL optimization case] series: paging optimization
Then this SQL can be rewritten as follows:
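One common form of this rewrite is a deferred join: an inner query walks the index and returns only the identifiers of the 15 wanted rows, and the outer query fetches the full rows for just those identifiers. The sketch below is an illustration and not necessarily the exact statement used here; it assumes a table named `pre_forum_post` whose posts are uniquely identified by a column `pid`, plus hypothetical values and offset.

```sql
-- Sketch only: table name, pid column, values, and offset are assumptions.
SELECT p.*
FROM pre_forum_post AS p
INNER JOIN (
    -- Locate the 15 target rows, returning only their ids.
    SELECT pid
    FROM pre_forum_post FORCE INDEX (idx_tid_dateline)
    WHERE tid = 12345
      AND invisible IN (0, -2)
    ORDER BY dateline
    LIMIT 130000, 15
) AS page USING (pid)
-- The join does not guarantee order, so re-sort the 15 result rows.
ORDER BY p.dateline;
```

The skipped rows are then discarded inside the narrow inner query, and full-width rows are carried through the join only for the final page.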
Look at the profiling statistics for this SQL:
As you can see, efficiency has more than doubled, which is quite good.
Finally, this problem only appears when paging through hot posts. Ordinary posts with only one or two pages of replies are fine with the original execution plan.
Therefore, it is recommended that the Discuz! team officially modify the existing index or add a new one, and check in the code whether a hot post is being paged. If so, force the new index to avoid performance problems.