
CHAR and VARCHAR in MySQL: differences and recommendations

Character data usually makes up the bulk of a database; it can account for more than 80% of the total. Handling it correctly therefore has a large effect on database performance. The two most commonly used character types are CHAR and VARCHAR: the former is fixed-length, the latter variable-length. The question is when to use CHAR and when to use VARCHAR.

First, the difference between VARCHAR and CHAR character data

In MySQL, the most commonly used character data types are VARCHAR and CHAR. Both store character data, but they differ considerably in structure and in how the data is stored, and the specific implementation also depends on the storage engine. Here I take the familiar MyISAM storage engine as an example to explain the differences between the two types. The recommendations that follow also assume this engine.

The first thing to understand is that, whichever storage engine is used, the system stores these two data types differently. That is exactly why the difference is worth studying, so that the appropriate type can be chosen for each situation. With that in mind, let's look at the details.

VARCHAR is used to store variable-length strings. Simply put, we only set a maximum length, and the system allocates storage according to the amount of data actually stored. For this reason it generally takes up less space than the fixed-length CHAR type. There are exceptions, however. For example, an administrator can specify the ROW_FORMAT=FIXED option when creating a MyISAM table. The system then uses a fixed amount of space for each row, which wastes storage.

Under normal circumstances VARCHAR saves disk space, and for this reason it is often assumed to improve database performance. But it is a double-edged sword: because the length is variable, updating the data can cause extra work. Suppose a column is declared with a maximum of 50 characters and a value is 10 characters long before the change. The system has allocated only 10 storage positions for it (ignoring the system's own overhead). After the change the value is 20 characters long. That is still within the 50-character limit, so the database accepts it, but the original storage location can no longer hold the value, and the system has to do additional work. How this is handled depends on the storage engine: some split the row into fragments (MyISAM, for example), while others split the page (InnoDB, for example).
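The 10-to-20-character update above can be sketched with a small model. This is a simplified illustration, not MySQL's actual row format: it assumes a single-byte character set, and the helper names `varchar_stored_bytes` and `fits_in_place` are hypothetical. Unlike the example in the text, it counts the length prefix that VARCHAR keeps with each value.

```python
def varchar_stored_bytes(value: str, max_chars: int) -> int:
    """Approximate bytes for one VARCHAR(max_chars) value (assumed single-byte charset)."""
    if len(value) > max_chars:
        raise ValueError("value exceeds the declared VARCHAR length")
    # One length byte for columns of up to 255 bytes, two bytes beyond that.
    length_prefix = 1 if max_chars <= 255 else 2
    return length_prefix + len(value.encode("latin-1"))


def fits_in_place(old_value: str, new_value: str, max_chars: int) -> bool:
    """True if the new value needs no more space than the old one occupied."""
    return varchar_stored_bytes(new_value, max_chars) <= varchar_stored_bytes(old_value, max_chars)


before = "a" * 10
after = "a" * 20

print(varchar_stored_bytes(before, 50))  # bytes used before the update
print(varchar_stored_bytes(after, 50))   # bytes needed after the update
print(fits_in_place(before, after, 50))  # whether the update fits in the old slot
```

When `fits_in_place` is false, the engine has to relocate or split the row, which is the extra work described above.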

CHAR differs from VARCHAR in that it uses fixed-length storage. Simply put, the system always allocates the full declared length. Even when a value is shorter than the maximum, the same amount of space is reserved, which obviously wastes disk space. One point deserves attention here. When a CHAR value is stored, MySQL right-pads it with spaces to the declared length, and when the value is retrieved the trailing spaces are removed (unless the PAD_CHAR_TO_FULL_LENGTH SQL mode is enabled). As a result, any trailing spaces in the original value are lost. When values are compared, the system pads them with spaces to the end of the string.
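The padding and stripping behavior can be imitated in a few lines. This is only a model of the rules described above, assuming the default SQL mode and a collation that pads with spaces for comparison; the function names are hypothetical and nothing here talks to a MySQL server.

```python
def char_store(value: str, length: int) -> str:
    """Right-pad a value to the declared CHAR length, as happens on storage."""
    if len(value) > length:
        raise ValueError("value exceeds the declared CHAR length")
    return value.ljust(length)


def char_retrieve(stored: str) -> str:
    """Strip trailing spaces, as happens on retrieval in the default SQL mode."""
    return stored.rstrip(" ")


def char_equal(a: str, b: str) -> bool:
    """Compare two values while ignoring trailing spaces (assumed pad-space comparison)."""
    return a.rstrip(" ") == b.rstrip(" ")


stored = char_store("abc  ", 10)   # the two trailing spaces are part of the input
print(len(stored))                 # always the declared length
print(repr(char_retrieve(stored))) # trailing spaces are gone after retrieval
print(char_equal("abc", "abc   "))
```

The practical consequence is that CHAR is a poor choice when trailing spaces are significant data.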

Clearly, the biggest difference between VARCHAR and CHAR is that the former is variable-length and the latter fixed-length. When storing a value, VARCHAR allocates space based on the data actually stored, while CHAR allocates the declared length regardless of how long the value really is. Does this mean CHAR is inferior to VARCHAR? Not at all; otherwise there would be no need for the CHAR type. VARCHAR can save storage space and improve the efficiency of data processing, but the side effects of its variable length sometimes cancel out those advantages. For this reason, CHAR is still the better choice in some cases.

Second, recommendations for real projects

According to the analysis above, VARCHAR is a double-edged sword: it can improve performance, but it can also add overhead. We need to weigh the two when deciding between VARCHAR and CHAR. In real projects, we consider the following factors.

The first is the length of the data. A field such as a person's name has a limited maximum length; we might give it 18 characters, for example. Although individual names differ in length, assigning them a fixed-length type of 18 characters does not waste much space in total. If VARCHAR were used instead and a name later changed so that the original storage space could not hold the new value, extra work would result. On balance, a fixed-length CHAR type is the better choice here. In real projects, a field whose values are relatively short is generally given a fixed length.
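To get a feel for how little space is wasted, the totals can be compared for a handful of values. This is a rough sketch under stated assumptions: a single-byte character set, one length byte per VARCHAR value, and made-up sample names; `compare_storage` is a hypothetical helper.

```python
def compare_storage(values: list[str], declared_len: int) -> tuple[int, int]:
    """Return (CHAR bytes, VARCHAR bytes) for the values, assuming one byte per character."""
    for v in values:
        if len(v) > declared_len:
            raise ValueError(f"{v!r} exceeds the declared length")
    char_total = declared_len * len(values)        # every row takes the full width
    varchar_total = sum(1 + len(v) for v in values)  # one length byte plus the data
    return char_total, varchar_total


# Sample names invented for illustration.
names = ["Li Wei", "Maria Garcia", "John Smith", "Aleksandr Petrov"]
char_total, varchar_total = compare_storage(names, 18)
print(char_total, varchar_total, char_total - varchar_total)
```

For short fields the difference is a few bytes per row, which is the price paid for avoiding relocation when a value grows.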

The second is whether the lengths are similar. If a field is relatively long but its values are always about the same length, for example between 90 and 100 characters, or exactly the same length, CHAR is the better fit. A typical case is an MD5 hash value: its hexadecimal form always has 32 characters, so when MD5 hashes are used to store user passwords, CHAR is very commonly used. CHAR is also generally recommended for other values of uniform length, such as a user's ID number.
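The constant length of an MD5 hash is easy to check with Python's standard `hashlib` module. The sample inputs are arbitrary, and `md5_hex` is just a convenience wrapper introduced for this illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hexadecimal MD5 digest of a string encoded as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


samples = ["", "a", "a much longer input string than the others"]
lengths = {len(md5_hex(s)) for s in samples}
print(lengths)  # a single length for every input, which is why CHAR(32) fits
```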

In addition, consider this question: what is the difference between CHAR(1) and VARCHAR(1)? Both can store only a single character, but VARCHAR takes one more byte than CHAR. This is because VARCHAR uses an extra byte to store the length of the value (two bytes when the column can hold more than 255 bytes). CHAR has no such bookkeeping overhead.
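The per-value overhead can be written down as a small calculation. This is a sketch assuming a single-byte character set, so that characters and bytes coincide; the function names are hypothetical.

```python
def char_bytes(declared_len: int) -> int:
    """Bytes for one CHAR(declared_len) value: always the full width, no prefix."""
    return declared_len


def varchar_bytes(actual_len: int, declared_len: int) -> int:
    """Bytes for one VARCHAR(declared_len) value holding actual_len characters."""
    if actual_len > declared_len:
        raise ValueError("value exceeds the declared length")
    prefix = 1 if declared_len <= 255 else 2  # size of the length prefix
    return prefix + actual_len


print(char_bytes(1), varchar_bytes(1, 1))        # one-character column
print(char_bytes(300), varchar_bytes(300, 300))  # a full value in a wide column
```

For a one-character column the length prefix doubles the space used, so VARCHAR(1) has no advantage over CHAR(1).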

The third is fragmentation. With CHAR, the storage space is allocated all at once, so the contents of a field are stored together; from this perspective alone, fragmentation does not arise. With variable-length types, the stored length varies. When a value's length differs before and after an update, fragmentation is bound to occur. Therefore, when variable-length character data is used, the database administrator needs to defragment from time to time, for example by exporting and re-importing the database.
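How holes appear and how a rebuild removes them can be shown with a toy model. This is deliberately not MyISAM's real file format: the `DataFile` class, its slot list, and its relocation rule are all assumptions made for illustration.

```python
class DataFile:
    """Toy data file: an ordered list of [record_id, size] slots; None marks a hole."""

    def __init__(self) -> None:
        self.slots: list[list] = []

    def insert(self, record_id: int, size: int) -> None:
        self.slots.append([record_id, size])

    def update(self, record_id: int, new_size: int) -> None:
        for slot in self.slots:
            if slot[0] == record_id:
                if new_size <= slot[1]:
                    return  # the new value fits in the existing slot
                slot[0] = None  # the old slot becomes a hole
                self.slots.append([record_id, new_size])  # record moves to the end
                return
        raise KeyError(record_id)

    def hole_bytes(self) -> int:
        """Total bytes occupied by holes."""
        return sum(size for rid, size in self.slots if rid is None)

    def rebuild(self) -> None:
        """Rewrite the file without holes, like an export followed by an import."""
        self.slots = [slot for slot in self.slots if slot[0] is not None]


f = DataFile()
f.insert(1, 10)
f.insert(2, 10)
f.update(1, 20)        # record 1 no longer fits and is relocated
print(f.hole_bytes())  # space left behind by the relocated record
f.rebuild()
print(f.hole_bytes())  # no holes remain after the rebuild
```

With fixed-width rows every update fits in place, so the `update` branch that creates a hole is never taken.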

Fourth, even when you use VARCHAR, you should not be too generous with the declared length. What does that mean? Suppose the user needs to store an address, and evaluation shows that 100 characters are enough. Some database administrators reason that, since VARCHAR allocates space according to actual need, they might as well declare a larger length, and so they give the field 200 characters. Are VARCHAR(100) and VARCHAR(200) really the same? The answer is no.

When both hold 90 characters of data, their storage space on disk is the same, but their memory consumption differs. For VARCHAR, space on disk is allocated according to the actual length, but space in memory is not. In memory, a fixed-size block is used to hold the value, sized by the declared length, which here means 200 characters. Obviously this has a considerable adverse effect on sorting and on temporary tables, both of which work on in-memory buffers.

Therefore, if a field takes part in filesort operations or disk-based temporary tables, the VARCHAR length still cannot be set too generously. Evaluate the length actually required and declare the column according to the longest expected value. If you want some headroom, leave about 10% extra. Do not assume that, because storage is allocated according to the actual length, the declared length can be chosen arbitrarily or simply set to the maximum.
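The memory cost of an over-generous declaration can be estimated with simple arithmetic. This is a simplified model of the fixed-size allocation described above, not a measurement of MySQL itself; the row count and the one-byte-per-character setting are assumptions, and `buffer_bytes` is a hypothetical helper.

```python
def buffer_bytes(rows: int, declared_chars: int, bytes_per_char: int = 1) -> int:
    """Memory needed when every row reserves the full declared width."""
    return rows * declared_chars * bytes_per_char


ROWS = 100_000  # assumed number of rows being sorted

for declared in (100, 200):
    total = buffer_bytes(ROWS, declared)
    print(f"VARCHAR({declared}): {total / 1_000_000:.0f} MB for {ROWS} rows")
```

Under this model, doubling the declared length doubles the buffer even though the stored data is unchanged, and a multi-byte character set multiplies the figure again through `bytes_per_char`.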
