<> Preface
In day-to-day development we use order by all the time. But do you know how order by actually works? How can it be optimized? What should we watch out for when using it? This article works through these questions together with you.
<> A simple order by example
Suppose we have an employee table with the following structure:
CREATE TABLE `staff` (
  `id` BIGINT(11) AUTO_INCREMENT COMMENT 'Primary key id',
  `id_card` VARCHAR(20) NOT NULL COMMENT 'ID card number',
  `name` VARCHAR(64) NOT NULL COMMENT 'Name',
  `age` INT(4) NOT NULL COMMENT 'Age',
  `city` VARCHAR(64) NOT NULL COMMENT 'City',
  PRIMARY KEY (`id`),
  INDEX idx_city (`city`)
) ENGINE = INNODB COMMENT 'Employee table';
The table data is as follows:
We now have this requirement: query the name, age, and city of the first 10 employees from Shenzhen, sorted by age. The corresponding SQL statement can be written like this:
select name,age,city from staff where city = 'Shenzhen' order by age limit 10;
The logic of this statement is clear, but what does its underlying execution process look like?
<> How order by works
<> The explain execution plan
Let's first use the explain keyword to look at the execution plan:
 * The key field of the execution plan shows that the idx_city index is used.
 * Using index condition in the Extra field indicates that an index condition is applied (index condition pushdown).
 * Using filesort in the Extra field indicates that a sort is performed.
So this SQL statement uses an index and also performs a sort. How is the sort carried out?
<> Full-field sort
MySQL allocates a small block of memory to each query thread for sorting, called sort_buffer. When are fields put into it? The matching rows are first located through the idx_city index, and their data is then put into sort_buffer.
Let's review how an index locates matching data. First, the idx_city index tree looks like this:
In the idx_city index tree, the leaf nodes store the primary key id. There is also a clustered index tree on the primary key id, which looks like this:
How does our query find the matching data? It first searches the idx_city index tree to find the matching primary key id, then uses that id to search the primary key index tree and find the corresponding row.
With order by added, the overall execution process is:
 * MySQL initializes sort_buffer for the thread and decides to put the queried fields name, age, city into it;
 * From the idx_city index tree, find the primary key id of the first record that satisfies city='Shenzhen', which is id=9 in the figure;
 * Go to the primary key id index tree, fetch the row with id=9, take the values of name, age, city, and store them in sort_buffer;
 * From the idx_city index tree, fetch the primary key id of the next record, which is id=13 in the figure;
 * Repeat the previous two steps (steps 3 and 4) until the value of city is no longer Shenzhen;
 * The first 5 steps have found all rows whose city is Shenzhen; now sort all the data in sort_buffer by age;
 * Return the first 10 rows of the sorted result to the client.
The execution diagram is as follows:
Reading all the fields required by the query into sort_buffer is what full-field sort means. At this point some readers may wonder: all the queried fields are put into sort_buffer, and sort_buffer is just a block of memory. What happens if the amount of data is too large for sort_buffer to hold?
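The full-field sort steps listed above can be sketched in Python. This is only an illustrative model, not how MySQL is implemented: the sample rows, the `staff` dictionary standing in for the primary key index, and the `full_field_sort` helper are all assumptions made for the example.

```python
from bisect import bisect_left
from operator import itemgetter

# Hypothetical sample rows: primary key id -> row (stands in for the clustered index).
staff = {
    9: {"name": "Ann", "age": 35, "city": "Shenzhen"},
    10: {"name": "Bob", "age": 28, "city": "Shanghai"},
    13: {"name": "Cid", "age": 22, "city": "Shenzhen"},
    15: {"name": "Dee", "age": 41, "city": "Shenzhen"},
}

# idx_city modeled as (city, id) entries kept in index order.
idx_city = sorted((row["city"], pk) for pk, row in staff.items())


def full_field_sort(city, limit):
    """Model of: select name,age,city ... where city = ? order by age limit ?"""
    sort_buffer = []
    # Seek to the first index entry for this city.
    pos = bisect_left(idx_city, (city,))
    while pos < len(idx_city) and idx_city[pos][0] == city:
        pk = idx_city[pos][1]
        row = staff[pk]  # lookup in the primary key index
        # Full-field sort: every queried column goes into the buffer.
        sort_buffer.append((row["name"], row["age"], row["city"]))
        pos += 1
    sort_buffer.sort(key=itemgetter(1))  # sort by age
    return sort_buffer[:limit]


print(full_field_sort("Shenzhen", 10))
```

Because every queried column travels with the sort key, the rows can be returned straight from the buffer once the sort finishes.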
<> Sorting with temporary disk files
In fact, the size of sort_buffer is controlled by a parameter: sort_buffer_size. If the data to be sorted is smaller than sort_buffer_size, the sort completes in sort_buffer memory. If the data to be sorted is larger than sort_buffer_size, the sort is done with the help of disk files.
How can we tell whether disk files were used for the sort? Use the following commands:
## Enable optimizer_trace to collect statistics
set optimizer_trace = "enabled=on";
## Execute the SQL statement
select name,age,city from staff where city = 'Shenzhen' order by age limit 10;
## Query the output statistics
select * from information_schema.optimizer_trace;
The number_of_tmp_files value in the output shows whether temporary files were used.
number_of_tmp_files is the number of temporary disk files used by the sort. If number_of_tmp_files > 0, disk files were used for sorting.
When temporary disk files are used, what does the whole sort process look like?
 * Fetch the required data from the primary key id index tree and put it into the sort_buffer memory block. When sort_buffer is nearly full, sort the data in sort_buffer and then write the sorted data to a small temporary file on disk.
 * Go back to the primary key id index tree to fetch more data, put it into sort_buffer, sort it, and write it to another small temporary file on disk.
 * Repeat until all rows that satisfy the condition have been fetched. Finally, merge the sorted small temporary files into one large ordered file.
Tip: sorting with small temporary disk files is in fact a merge sort.
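The spill-and-merge process described above can be sketched as a small external merge sort in Python. This is a minimal model under stated assumptions: `external_sort` and `buffer_capacity` are hypothetical names, the buffer is measured in number of values rather than bytes, and the returned file count only imitates what number_of_tmp_files reports.

```python
import heapq
import tempfile


def external_sort(values, buffer_capacity):
    """Sort integers with a bounded buffer, spilling sorted runs to temp files.

    Returns (sorted_values, number_of_tmp_files).
    """
    run_files = []
    buffer = []

    def spill():
        # Sort the full buffer and write it out as one sorted run.
        buffer.sort()
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{v}\n" for v in buffer)
        f.seek(0)
        run_files.append(f)
        buffer.clear()

    for v in values:
        if len(buffer) == buffer_capacity:
            spill()  # the buffer is full and more data is arriving
        buffer.append(v)

    if not run_files:
        # Everything fit in the buffer: plain in-memory sort, no temp files.
        return sorted(buffer), 0

    if buffer:
        spill()  # flush the last partial run

    # Merge the sorted runs; heapq.merge reads them lazily, one value at a time.
    runs = [(int(line) for line in f) for f in run_files]
    merged = list(heapq.merge(*runs))
    for f in run_files:
        f.close()
    return merged, len(run_files)


ages = [35, 22, 41, 28, 30, 25, 33]
print(external_sort(ages, buffer_capacity=3))
print(external_sort(ages, buffer_capacity=10))
```

With a capacity of 3 the seven values are split into three sorted runs before merging, while a capacity of 10 keeps the whole sort in memory.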
Some readers may have another question: since sort_buffer cannot hold everything and temporary disk files are needed, sorting efficiency suffers. So why put fields that are irrelevant to the sort (name, city) into sort_buffer at all? Wouldn't it be better to put in only the age field that the sort depends on? This is where rowid sort comes in.
<> rowid sort
rowid sort puts only the fields needed for sorting, plus the primary key id, into sort_buffer. So how does MySQL decide between full-field sort and rowid sort?
It is controlled by a parameter: max_length_for_sort_data. This parameter limits the row length of the data being sorted. If a single row is longer than this value, MySQL considers the row too large and switches to rowid sort. We can check the value of this parameter with the following command:
show variables like 'max_length_for_sort_data'; 
The default value of max_length_for_sort_data is 1024 (this is the MySQL 5.7 default; other versions may differ). In this example, the length of name, age, city = 64+4+64 = 132 < 1024, so full-field sort is used. Let's change this parameter to a smaller value:
## Set the maximum single-row length of sort data to 32
set max_length_for_sort_data = 32;
## Execute the query
select name,age,city from staff where city = 'Shenzhen' order by age limit 10;
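The decision rule can be written down as a tiny Python function. This is a deliberately simplified sketch: `choose_sort_mode` is a hypothetical helper, and it sums the declared column lengths the way the article does, whereas the real server works with byte lengths that depend on character set and column type.

```python
def choose_sort_mode(column_lengths, max_length_for_sort_data=1024):
    """Pick the sort mode from the total length of the queried columns."""
    row_length = sum(column_lengths)
    # A row longer than the limit is considered too large for full-field sort.
    if row_length > max_length_for_sort_data:
        return "rowid"
    return "full-field"


# name VARCHAR(64), age INT(4), city VARCHAR(64) -> 64 + 4 + 64 = 132
print(choose_sort_mode([64, 4, 64]))                               # limit 1024
print(choose_sort_mode([64, 4, 64], max_length_for_sort_data=32))  # limit 32
```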
With rowid sort, what does the whole SQL execution process look like?
 * MySQL initializes sort_buffer for the thread and decides to put in the sort field age and the primary key id;
 * From the idx_city index tree, find the primary key id of the first record that satisfies city='Shenzhen', which is id=9 in the figure;
 * Go to the primary key id index tree, fetch the row with id=9, take the values of age and the primary key id, and store them in sort_buffer;
 * From the idx_city index tree, fetch the primary key id of the next record, which is id=13 in the figure;
 * Repeat the previous two steps (steps 3 and 4) until the value of city is no longer Shenzhen;
 * The first 5 steps have found all rows whose city is Shenzhen; now sort all the data in sort_buffer by age;
 * Walk through the sorted result, take the first 10 rows, go back to the original table by id, fetch the city, name, and age fields, and return them to the client.
The execution diagram is as follows:
Compared with full-field sort, rowid sort performs one extra lookup back to the table after sorting.
What is a table lookup ("going back to the table")? Taking a primary key and querying the primary key index again to fetch the row is called going back to the table.
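The rowid sort steps can be modeled in the same spirit. The sketch below is an assumption-laden illustration rather than MySQL's implementation: the sample rows and the `rowid_sort` helper are hypothetical, and a simple counter tracks how often the primary key index is consulted so the extra lookups become visible.

```python
from bisect import bisect_left

# Hypothetical sample rows: primary key id -> row (stands in for the clustered index).
staff = {
    9: {"name": "Ann", "age": 35, "city": "Shenzhen"},
    10: {"name": "Bob", "age": 28, "city": "Shanghai"},
    13: {"name": "Cid", "age": 22, "city": "Shenzhen"},
    15: {"name": "Dee", "age": 41, "city": "Shenzhen"},
}
idx_city = sorted((row["city"], pk) for pk, row in staff.items())


def rowid_sort(city, limit):
    """Return (rows, primary_key_lookups) for a rowid-style sort."""
    lookups = 0
    sort_buffer = []
    pos = bisect_left(idx_city, (city,))
    while pos < len(idx_city) and idx_city[pos][0] == city:
        pk = idx_city[pos][1]
        lookups += 1  # fetch the row to read its age
        # Only the sort key and the primary key go into the buffer.
        sort_buffer.append((staff[pk]["age"], pk))
        pos += 1
    sort_buffer.sort()

    rows = []
    for _age, pk in sort_buffer[:limit]:
        lookups += 1  # go back to the table for the remaining columns
        row = staff[pk]
        rows.append((row["name"], row["age"], row["city"]))
    return rows, lookups


print(rowid_sort("Shenzhen", 2))
```

The buffer entries are smaller than in full-field sort, but each returned row costs one additional primary key lookup.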
Through optimizer_trace, we can see whether rowid sort was used:
## Enable optimizer_trace to collect statistics
set optimizer_trace = "enabled=on";
## Execute the SQL statement
select name,age,city from staff where city = 'Shenzhen' order by age limit 10;
## Query the output statistics
select * from information_schema.optimizer_trace;
<> Full-field sort vs. rowid sort
 * Full-field sort: if sort_buffer does not have enough memory, temporary disk files are needed, which causes disk access.
 * rowid sort: sort_buffer can hold more rows, but the original table has to be read again to fetch the data, so there is one more table lookup than with full-field sort.
Normally, for the InnoDB storage engine, full-field sort is preferred. Note that max_length_for_sort_data is set to 1024, which is fairly large. In most cases the sorted row will not exceed this value, so full-field sort is what gets used.
<> Some ideas for optimizing order by
How can we optimize order by statements?
 * A sort is needed because the data is unordered. If the data is already ordered, no sort is needed. Index data is itself ordered, so we can optimize an order by statement by creating a composite index.
 * We can also tune parameters such as max_length_for_sort_data.
<> Composite index optimization
Let's look again at the query plan of the example SQL:
explain select name,age,city from staff where city = 'Shenzhen' order by age limit 10;
We add a composite index idx_city_age on the query condition city and the sort field age, then check the execution plan again:
alter table staff add index idx_city_age(city,age);
explain select name,age,city from staff where city = 'Shenzhen' order by age limit 10;
We can see that with the idx_city_age composite index in place, Using filesort is no longer needed. Why? Because the index itself is ordered. A schematic of the idx_city_age composite index is as follows:
The whole SQL execution process now becomes:
 * From the idx_city_age index, find the primary key id of the first record that satisfies city='Shenzhen'.
 * Go to the primary key id index, fetch the whole row, take the values of name, city, age, and return them directly as part of the result set.
 * From the idx_city_age index, fetch the primary key id of the next record.
 * Repeat steps 2 and 3 until the 10th record has been found or the city='Shenzhen' condition is no longer satisfied.
The flow diagram is as follows:
As the diagram shows, there is still a table lookup. Is there a more efficient approach for this example? Yes, we can use a covering index:
Covering index: the queried columns can be read directly from the index columns, without going back to the table. In other words, when the index columns used by the SQL cover all the columns in the query result, the index is a covering index.
If we build a composite index on city, age, name (with age directly after city, so the index order still matches order by age), the query uses a covering index, and even the table lookup can be skipped during execution.
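To see why an ordered, covering index removes both the sort and the table lookup, here is a small Python model. It assumes the index columns are ordered (city, age, name) so that entries for one city come back already sorted by age; the index name `idx_city_age_name`, the sample entries, and the `top_n_by_age` helper are hypothetical.

```python
from bisect import bisect_left

# Hypothetical covering index on (city, age, name), kept in index order.
idx_city_age_name = sorted([
    ("Shenzhen", 35, "Ann"),
    ("Shanghai", 28, "Bob"),
    ("Shenzhen", 22, "Cid"),
    ("Shenzhen", 41, "Dee"),
    ("Shanghai", 50, "Eve"),
])


def top_n_by_age(city, limit):
    """Model of: select name,age,city ... where city = ? order by age limit ?"""
    result = []
    pos = bisect_left(idx_city_age_name, (city,))  # seek to the first entry for the city
    while pos < len(idx_city_age_name) and len(result) < limit:
        entry_city, age, name = idx_city_age_name[pos]
        if entry_city != city:
            break  # left the matching range
        # All queried columns come from the index entry: no table lookup, no sort.
        result.append((name, age, entry_city))
        pos += 1
    return result


print(top_n_by_age("Shenzhen", 2))
```

The scan stops as soon as the limit is reached, so it never touches more index entries than it returns.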
<> Parameter tuning
We can also tune parameters to optimize how order by executes. For example, we can adjust sort_buffer_size. If sort_buffer is too small and the data volume is large, the sort falls back to temporary disk files. If the MySQL server has plenty of resources, this value can be raised somewhat.
We can also adjust max_length_for_sort_data. If this value is too small, order by may use rowid sort, which goes back to the table and reduces query performance. So max_length_for_sort_data can be set somewhat larger.
Of course, in many cases we can simply use the default values of these MySQL parameters.
<> Things to watch out for when using order by
<> Without a where condition, does the order by field need an index?
In daily development we may run into an order by with no where condition. In that case, does the field after order by need an index? For example, given the following SQL, should create_time be indexed?
select * from A order by create_time; 
For an unconditional query, even if create_time has an index, the index will not be used. The MySQL optimizer estimates that walking an ordinary secondary index and going back to the table costs more than a full table scan followed by a sort. So it chooses the full table scan and then sorts using full-field sort or rowid sort.
If the query is changed to:
select * from A order by create_time limit m; 
For an unconditional query with a small m, the index can be used. The MySQL optimizer estimates that reading rows in index order and stopping as soon as m rows have been obtained costs less than a full table scan, so it chooses the secondary index.
<> What if a large paging limit offset causes a lot of sorting?
Suppose the SQL is as follows:
select * from A order by a limit 100000,10 
 * Record the last id of the previous page and include it in the query condition when fetching the next page, for example: where id > (last id of the previous page) limit 10.
 * If the business allows it, you can also limit the number of pages.
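The first suggestion can be tried out with a short Python script. To keep it self-contained, it uses the SQLite engine from the Python standard library as a stand-in for MySQL and assumes pages are ordered by the primary key id; the table contents and the `fetch_page` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, a INTEGER)")
conn.executemany(
    "INSERT INTO A (id, a) VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 101)],
)


def fetch_page(last_id, page_size=10):
    """Fetch the page that starts right after last_id."""
    # The id condition lets the engine seek into the primary key
    # instead of reading and discarding all the skipped rows.
    return conn.execute(
        "SELECT id, a FROM A WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()


first_page = fetch_page(last_id=0)
second_page = fetch_page(last_id=first_page[-1][0])
print(first_page[0], first_page[-1])
print(second_page[0], second_page[-1])
```

The caller only has to remember the last id it saw, so the cost of a page does not grow with how deep into the result it is.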
<> The index storage order differs from the order by: how to optimize?
Suppose there is a composite index idx_age_name, and the requirement changes to this: query the names and ages of the first 10 employees, sorted by age and, when ages are equal, by name in descending order. The corresponding SQL statement can be written like this:
select name,age from staff order by age ,name desc limit 10; 
Looking at the execution plan, we find Using filesort.
That is because in the idx_age_name index tree, age is sorted in ascending order and, for equal age, name is sorted in ascending order as well. In the order by, age is sorted ascending but, for equal age, name is sorted descending. In other words, the index storage order does not match the order by.
How can we optimize this? MySQL 8.0 supports descending indexes, so the index can be defined like this:
CREATE TABLE `staff` (
  `id` bigint(11) NOT NULL AUTO_INCREMENT COMMENT 'Primary key id',
  `id_card` varchar(20) NOT NULL COMMENT 'ID card number',
  `name` varchar(64) NOT NULL COMMENT 'Name',
  `age` int(4) NOT NULL COMMENT 'Age',
  `city` varchar(64) NOT NULL COMMENT 'City',
  PRIMARY KEY (`id`),
  KEY `idx_age_name` (`age`,`name` desc) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=15 DEFAULT CHARSET=utf8 COMMENT='Employee table';
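The mismatch between index order and order by can be illustrated with plain Python sorting. The sample rows are made up; the point of the sketch is only that an ascending (age, name) ordering differs from the ordering the query asks for, while an (age asc, name desc) ordering matches it.

```python
# Hypothetical (name, age) rows.
rows = [("Bob", 30), ("Amy", 30), ("Cal", 25), ("Dan", 25)]

# Entry order in an ascending index on (age, name).
asc_index_order = sorted(rows, key=lambda r: (r[1], r[0]))

# Order requested by: order by age, name desc.
# Two stable passes: name descending first, then age ascending.
wanted_order = sorted(rows, key=lambda r: r[0], reverse=True)
wanted_order.sort(key=lambda r: r[1])

print(asc_index_order)
print(wanted_order)
print(asc_index_order == wanted_order)
```

An index declared as (age, name desc) would store its entries in the second order, so the rows could be read off in sequence without a separate sort.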
<> With an in condition containing multiple values, is there a sort step during execution?
If we have the composite index idx_city_age, executing the following SQL does not go through a sort step:
select * from staff where city in ('Shenzhen') order by age limit 10;
However, if the in condition contains multiple values, there will be a sort step.
explain select * from staff where city in ('Shenzhen','Shanghai') order by age limit 10;
That is because in now has two values. Among the rows for Shenzhen, age is in order, but once the ages of the Shanghai rows are added, there is no guarantee that all the ages together are in order. Therefore Using filesort is needed.
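A few lines of Python make the reasoning concrete. The age lists are invented sample data; each list is sorted on its own, the way the entries for one city are ordered inside a (city, age) index.

```python
# Ages as they appear in a (city, age) index, one sorted range per city.
shanghai_ages = [28, 30, 50]
shenzhen_ages = [22, 35, 41]

# Reading the two index ranges one after the other.
scan_order = shanghai_ages + shenzhen_ages
print(scan_order)
print(scan_order == sorted(scan_order))

# An extra sort step is needed before the limit can be applied.
top_4 = sorted(scan_order)[:4]
print(top_4)
```

Each range is ordered, but their concatenation is not, which is why the plan has to sort before returning the first rows.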
<> Finally
If you found this article helpful, remember to like, bookmark, and share it!