I ran into an optimization case today that I thought was quite representative, so I'm recording it here as a note on the problem of casually defined fields.
Recall that when designing tables, many people habitually define columns as varchar(64), varchar(255), and so on, even though in most cases only 5-15 bytes are ever stored. With that in mind, take a look at the following case.
The table has a composite index on (client_id, channel).
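The post does not show the actual DDL, so here is a hypothetical reconstruction of the table in this case; the table name, the extra columns, and the charset are assumptions, with only client_id, channel, and the composite index taken from the text:

```sql
-- Hypothetical reconstruction; only client_id, channel, and the
-- composite index come from the post, everything else is assumed.
CREATE TABLE t_order (
  id        BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  client_id VARCHAR(255) DEFAULT NULL,   -- numeric IDs stored as strings
  channel   VARCHAR(255) DEFAULT NULL,
  PRIMARY KEY (id),
  KEY idx_client_channel (client_id, channel)  -- the composite index
) ENGINE=InnoDB;
```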
Run EXPLAIN on the query: the execution plan looks perfect, with the index being used.
Now look at the actual execution:
The actual execution is very bad. After some discussion, the conclusion was that from the execution plan the index usage looks perfect, so the plan itself seems blameless. But looking at the table design, you will find that client_id is defined as varchar(255). Once you see this, you can try the following change:
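The post does not show the actual statements, but the change being suggested is simply to quote the numeric literal so that it is compared as a string. A minimal sketch, with a hypothetical table name and values:

```sql
-- Original form: client_id is varchar(255), but the value is
-- passed as a number, forcing an implicit type conversion.
SELECT * FROM t_order WHERE client_id = 10010 AND channel = 'app';

-- The variant to try: quote the value so the comparison stays
-- string-to-string and the (client_id, channel) index can be used
-- for an equality lookup.
SELECT * FROM t_order WHERE client_id = '10010' AND channel = 'app';
```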
From the execution plan alone the two look almost the same, but the details differ a lot: key_len has dropped from 1032 to 258, the access type has become a const-style equality lookup, and the estimated rows have dropped from 10 million to about 100,000, so the I/O savings are easy to understand.
Let's look at the actual execution:
Wow, from 11.69 seconds down to 0.25 seconds. How big an improvement is that? Let's count: 11.69 / 0.25 is roughly a 47x speedup.
What do you think when you see this? Remember the case, and next time just add quotes around the value to fix it? Why not go one step further and ask: can it be optimized even more, and why does this happen in the first place?
Let's take a look at the first question:
Can it be optimized?
The answer is of course yes. From the perspective of index length, 258 bytes is still quite large. Judging by its name, the client_id field only ever stores numeric values, so why not store it as an int unsigned? The index length for that column would then drop from 258 bytes to 4. Wouldn't that save a lot?
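A sketch of that type change, assuming a hypothetical table name; note that on a 10-million-row table this ALTER rebuilds the table, so it should be run with an online-DDL tool or in a maintenance window:

```sql
-- Change the string column to a 4-byte unsigned integer.
-- Requires that every existing value is actually numeric.
ALTER TABLE t_order MODIFY client_id INT UNSIGNED NOT NULL;

-- After this, the client_id part of the (client_id, channel) index
-- occupies 4 bytes per entry instead of up to 258, and numeric
-- literals no longer trigger implicit conversion.
```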
Let's take a look at the second question, why is this so?
There are two reasons for this, both rooted in one principle: MySQL uses a cost-based optimizer. client_id is defined as a character type in the table, but a numeric value is passed in at query time, so every row's value must go through an implicit type conversion. That is where the tragedy begins, and it ultimately causes MySQL to fall back to scanning the full index.
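The rule behind this: when MySQL compares a string with a number, it converts both sides to floating point. Many distinct string values then "equal" the same number, so the optimizer cannot map the numeric literal to a single key value in the string index and must scan instead. A small illustration of this standard MySQL behavior:

```sql
-- All of these string values compare equal to the number 123,
-- because both sides are converted to floating point first.
SELECT '123' = 123,    -- 1
       '0123' = 123,   -- 1 (leading zero ignored as a number)
       ' 123' = 123,   -- 1 (leading whitespace ignored)
       '123 ' = 123;   -- 1 (trailing whitespace ignored)
-- With client_id = 10010 (unquoted), '10010', '010010', ' 10010', ...
-- would all match, so no single index lookup is possible.
```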
What do we need to pay attention to in this case?
Choose data types sensibly. This basic groundwork matters enormously; it is what "winning at the starting line" means. Nothing about a schema should be casual: don't define a table where every column except the primary key is varchar(255), and be careful about indexing fields of type double/float.
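The warning about double/float deserves a quick illustration: equality comparisons on approximate types are unreliable, which makes them poor candidates for index lookups. A minimal sketch (table and column names are made up):

```sql
-- FLOAT is an approximate type: the stored value of 1.1 is a
-- 4-byte approximation, while the literal 1.1 is a DOUBLE.
CREATE TABLE t_price (v FLOAT);
INSERT INTO t_price VALUES (1.1);

-- Returns no rows in MySQL: the FLOAT approximation of 1.1
-- does not equal the DOUBLE literal 1.1.
SELECT * FROM t_price WHERE v = 1.1;
```

For exact values such as money, DECIMAL avoids this class of problem entirely.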