1, What is MVCC?

MVCC stands for Multiversion Concurrency Control. It governs concurrent access to a database by controlling which version of the data a transaction reads, so that read operations are not blocked by write operations.

 

For instance, programmer A is reading some data from the database while programmer B is modifying it. Suppose B's changes happen inside a transaction that lasts about 10 s. During those 10 s, A may see inconsistent data. How can A keep reading consistent data until B commits?

 

There are several ways to handle this. The first is lock-based concurrency control: when programmer B starts modifying the data, B locks it. If programmer A tries to read at that point, A cannot, and must wait until B finishes before reading. This guarantees that A never reads inconsistent data, but it hurts the program's efficiency. The second is MVCC: each user connected to the database sees a snapshot of the database as of a particular point in time. Until B's transaction commits, A keeps reading that snapshot and does not see the changes made inside B's transaction. Only after B commits can B's modifications be read.

      

     
When a database that supports MVCC updates some data, it does not overwrite the old data with the new data. Instead, it marks the old data as obsolete and adds a new version of the data elsewhere. As a result, multiple versions of the same data are stored, but only one of them is the latest.
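The idea of adding a version instead of overwriting can be sketched in a few lines of Python. This is only an illustration: the names `Version`, `VersionedRecord`, and the `obsolete` flag are made up for this sketch and do not correspond to any database's actual storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    value: str
    obsolete: bool = False  # set once a newer version supersedes this one


@dataclass
class VersionedRecord:
    versions: list = field(default_factory=list)

    def update(self, new_value):
        # Mark the current version obsolete instead of overwriting it...
        if self.versions:
            self.versions[-1].obsolete = True
        # ...and store the new value as an additional version.
        self.versions.append(Version(new_value))

    def latest(self):
        # Exactly one stored version is not obsolete: the newest one.
        return next(v.value for v in self.versions if not v.obsolete)


record = VersionedRecord()
record.update("long")
record.update("Long")
print([(v.value, v.obsolete) for v in record.versions])
print(record.latest())
```

Both versions remain stored after the second update; only the flag tells them apart.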

 

MVCC provides a point-in-time consistent view. When a transaction reads under MVCC, a timestamp or transaction ID is usually used to determine which state of the database, and which versions of the data, it accesses. Read and write transactions are isolated from each other and do not interfere. Suppose the same data is accessed by a read transaction and modified by a write transaction at the same time. The write transaction actually creates a new version of the data, while the read transaction accesses the old version. Only after the write transaction commits can read transactions access the new version.
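A minimal sketch of this read/write isolation, assuming each version carries a commit timestamp and each reader carries a snapshot timestamp. The class `VersionedValue` and the field `commit_ts` are hypothetical names chosen for the illustration, not any engine's API.

```python
class VersionedValue:
    def __init__(self, initial):
        # Each version is a dict; commit_ts is None while the writer is uncommitted.
        self.versions = [{"value": initial, "commit_ts": 0}]

    def write(self, value):
        # A write adds a new version and leaves the old one untouched.
        version = {"value": value, "commit_ts": None}
        self.versions.append(version)
        return version

    def read(self, snapshot_ts):
        # A reader never waits: it takes the newest version committed
        # at or before its own snapshot timestamp.
        visible = [
            v for v in self.versions
            if v["commit_ts"] is not None and v["commit_ts"] <= snapshot_ts
        ]
        return max(visible, key=lambda v: v["commit_ts"])["value"]


item = VersionedValue("old")
pending = item.write("new")           # the write transaction is still open
read_during_write = item.read(5)      # the reader still gets the old version
pending["commit_ts"] = 10             # the write transaction commits
read_old_snapshot = item.read(5)      # a snapshot taken earlier is unchanged
read_new_snapshot = item.read(10)     # a later snapshot sees the new version
```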

 

     
There are two ways to implement MVCC. The first stores multiple versions of each data record in the database; when those versions are no longer needed, a garbage collector reclaims them. PostgreSQL and Firebird/InterBase use this approach. SQL Server uses a similar mechanism, except that the old versions are not stored in the main database but in a separate one, tempdb. The second keeps only the latest version of the data in the database and dynamically reconstructs old versions from undo information when needed. Oracle and MySQL/InnoDB use this approach.
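The second approach can be illustrated with a toy undo log. This is a sketch under simplifying assumptions (one record, integer version numbers, an in-memory list standing in for the undo log); `UndoRecord` and `read_as_of` are hypothetical names.

```python
class UndoRecord:
    def __init__(self, value, version):
        self.value = value      # only the latest value is stored in place
        self.version = version  # version that wrote the latest value
        self.undo_log = []      # (version, previous value) pairs, newest last

    def update(self, new_value, version):
        # Save what is needed to undo this change, then overwrite in place.
        self.undo_log.append((self.version, self.value))
        self.value, self.version = new_value, version

    def read_as_of(self, snapshot_version):
        value, version = self.value, self.version
        # Step back through the undo log until reaching a version
        # that the reader is allowed to see.
        for old_version, old_value in reversed(self.undo_log):
            if version <= snapshot_version:
                break
            value, version = old_value, old_version
        return value if version <= snapshot_version else None


record = UndoRecord("a", version=1)
record.update("b", version=3)
record.update("c", version=5)
print(record.read_as_of(4))
```

A reader at version 4 cannot see the value written at version 5, so the sketch rebuilds the value written at version 3 from the undo log. A reader older than every version gets `None`.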

 

2, How InnoDB implements MVCC

MVCC can be thought of as a variant of row-level locking that avoids locking in many cases, so its overhead is lower. Most MVCC implementations provide non-blocking reads, and writes lock only the rows they need. InnoDB implements MVCC by keeping a snapshot of the data as of a certain point in time. However long a transaction runs, the data it sees stays consistent; in other words, transactions do not affect one another while they execute. The following briefly describes how MVCC is implemented in InnoDB.

InnoDB's MVCC is implemented by storing two hidden columns with each row: one holds the row's creation time, the other its expiration (deletion) time. The "time" here is not a timestamp but a system version number, which is incremented every time a new transaction starts. (This two-column picture is a simplified model; the real row format stores a transaction ID and a pointer into the undo log.) Under the REPEATABLE READ (RR) isolation level, MVCC operates as follows:

* SELECT. InnoDB checks each row against two conditions:
  * InnoDB only finds rows whose version is earlier than or equal to the current transaction version. This guarantees that the rows the transaction reads either existed before the transaction started or were inserted or modified by the transaction itself.
  * The row's deletion version is either undefined or greater than the current transaction version. This guarantees that the rows the transaction reads were not deleted before the transaction started.
* INSERT. The current version number is saved as the row version number of each newly inserted row.
* DELETE. The current version number is saved as the deletion marker of each deleted row.
* UPDATE. This becomes a combination of INSERT and DELETE: the inserted row gets the current version number as its row version number, and the original row gets the current version number as its deletion marker.
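The write rules in the list above can be modelled roughly as follows. This is a sketch of the simplified two-column model only: the `Row` class and the `insert`, `delete`, and `update` helpers are hypothetical, and `None` stands in for an undefined deletion version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    row_id: int
    name: str
    create_ver: int                   # hidden column: creating version
    delete_ver: Optional[int] = None  # hidden column: None means undefined


def insert(table, txn_ver, row_id, name):
    # INSERT: the current version becomes the row version.
    table.append(Row(row_id, name, create_ver=txn_ver))


def delete(table, txn_ver, row_id):
    # DELETE: the current version is stored as the deletion marker.
    for row in table:
        if row.row_id == row_id and row.delete_ver is None:
            row.delete_ver = txn_ver


def update(table, txn_ver, row_id, name):
    # UPDATE: a DELETE of the old row plus an INSERT of the new one.
    delete(table, txn_ver, row_id)
    insert(table, txn_ver, row_id, name)


table = []
insert(table, 1, 2, "long")
update(table, 5, 2, "Long")
for row in table:
    print(row)
```

After the update, the old row is still present with deletion version 5, next to a new row with creation version 5.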
Because old data is not actually deleted, it has to be cleaned up. InnoDB runs a background thread for this cleanup; the rule is to remove rows whose deletion version number is lower than the current system version number. This process is called purge.
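A rough sketch of the purge rule in the simplified model. The `purge` function and its `horizon` parameter are assumptions made for this illustration: `horizon` plays the role of the version number below which no reader can still need a deleted row. The text above uses the current system version number for it.

```python
def purge(rows, horizon):
    """Drop row versions whose deletion version is lower than `horizon`."""
    return [
        row for row in rows
        if row["delete_ver"] is None or row["delete_ver"] >= horizon
    ]


rows = [
    {"id": 1, "name": "yang", "create_ver": 1, "delete_ver": 4},
    {"id": 2, "name": "long", "create_ver": 1, "delete_ver": 5},
    {"id": 2, "name": "Long", "create_ver": 5, "delete_ver": None},
]
remaining = purge(rows, horizon=5)
print([row["name"] for row in remaining])
```

Rows that were never deleted are always kept; only versions with a deletion marker below the horizon are removed.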

 

3, A simple example
create table yang (id int primary key auto_increment, name varchar(20));
Suppose the system version number starts at 1.

INSERT

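InnoDB saves the current system version number as the version number of each newly inserted row.
The first transaction, with ID 1:
start transaction;
insert into yang values(NULL,'yang');
insert into yang values(NULL,'long');
insert into yang values(NULL,'fei');
commit;
The corresponding table data is as follows (the last two columns are hidden and cannot be seen through a query):

| id | name | creation version (transaction ID) | deletion version (transaction ID) |
|----|------|-----------------------------------|-----------------------------------|
| 1  | yang | 1                                 | undefined                         |
| 2  | long | 1                                 | undefined                         |
| 3  | fei  | 1                                 | undefined                         |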

 

SELECT
InnoDB checks each row against the following two conditions:

a. InnoDB only finds rows whose version is earlier than the current transaction version, that is, rows whose system version number is less than or equal to the transaction's system version number. This ensures that the rows the transaction reads either existed before the transaction started or were inserted or modified by the transaction itself.
b. The row's deletion version is either undefined or greater than the current transaction version number. This ensures that the rows the transaction reads were not deleted before the transaction started.

Only rows that satisfy both a and b are returned as the query result.
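Conditions a and b translate directly into a small predicate. The function name `is_visible` and its parameters are hypothetical; `None` represents an undefined deletion version.

```python
def is_visible(create_ver, delete_ver, txn_ver):
    # Condition a: the row was created by this transaction or an earlier one.
    cond_a = create_ver <= txn_ver
    # Condition b: the row is not deleted, or was deleted by a later transaction.
    cond_b = delete_ver is None or delete_ver > txn_ver
    return cond_a and cond_b


print(is_visible(create_ver=1, delete_ver=None, txn_ver=2))  # True
print(is_visible(create_ver=3, delete_ver=None, txn_ver=2))  # False
print(is_visible(create_ver=1, delete_ver=4, txn_ver=2))     # True
print(is_visible(create_ver=1, delete_ver=2, txn_ver=2))     # False
```

The last call shows the boundary case: a row deleted by the reading transaction itself has a deletion version equal to, not greater than, the transaction version, so it is no longer returned.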

 

DELETE
For each deleted row, InnoDB saves the current system version number (the transaction ID) as the deletion marker.
Consider the following concrete example.
The second transaction, with ID 2:
start transaction;
select * from yang; -- (1)
select * from yang; -- (2)
commit;
Scenario 1

Suppose that while transaction 2 is running and has just executed (1), another transaction with ID 3 inserts a row into the table.
The third transaction, with ID 3:
start transaction;
insert into yang values(NULL,'tian');
commit;
The data in the table is now:

| id | name | creation version (transaction ID) | deletion version (transaction ID) |
|----|------|-----------------------------------|-----------------------------------|
| 1  | yang | 1                                 | undefined                         |
| 2  | long | 1                                 | undefined                         |
| 3  | fei  | 1                                 | undefined                         |
| 4  | tian | 3                                 | undefined                         |

Transaction 2 then executes (2). The row with id=4 has creation version 3 (the ID of the transaction that inserted it), while the current transaction has ID 2. Because InnoDB only finds rows whose creation version is less than or equal to the current transaction ID, the id=4 row is not retrieved by (2). Both select statements in transaction 2 therefore return only the following:

| id | name |
|----|------|
| 1  | yang |
| 2  | long |
| 3  | fei  |
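Scenario 1 can be replayed in the simplified model. In this sketch, rows are plain tuples and `select_all` is a hypothetical helper that applies the two SELECT conditions.

```python
# Each row is (id, name, create_ver, delete_ver); None means undefined.
rows = [(1, "yang", 1, None), (2, "long", 1, None), (3, "fei", 1, None)]


def select_all(rows, txn_id):
    return [
        (rid, name)
        for rid, name, created, deleted in rows
        if created <= txn_id and (deleted is None or deleted > txn_id)
    ]


first_read = select_all(rows, txn_id=2)   # statement (1)
rows.append((4, "tian", 3, None))         # transaction 3 inserts and commits
second_read = select_all(rows, txn_id=2)  # statement (2)

print(first_read == second_read)  # True
print(second_read)
```

The id=4 row has creation version 3, which is greater than 2, so both reads in transaction 2 return the same three rows.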

Scenario 2

Suppose that transaction 2 has just executed (1), and that after transaction 3 completes, transaction 4 is executed.
The fourth transaction, with ID 4:
start transaction;
delete from yang where id=1;
commit;
The table in the database is now:

| id | name | creation version (transaction ID) | deletion version (transaction ID) |
|----|------|-----------------------------------|-----------------------------------|
| 1  | yang | 1                                 | 4                                 |
| 2  | long | 1                                 | undefined                         |
| 3  | fei  | 1                                 | undefined                         |
| 4  | tian | 3                                 | undefined                         |

Transaction 2 then executes (2). By the SELECT conditions, it retrieves rows whose creation version (creating transaction ID) is less than or equal to the current transaction ID and whose deletion version (deleting transaction ID) is undefined or greater than the current transaction ID. The id=4 row is excluded for the reason given above. The id=1 row has deletion version 4, which is greater than the current transaction ID 2, so (2) select * from yang still retrieves it. Both select statements in transaction 2 therefore return the following:

| id | name |
|----|------|
| 1  | yang |
| 2  | long |
| 3  | fei  |
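The same kind of replay for scenario 2, again using hypothetical tuple rows and helper names. The starting state is the table after transactions 1 and 3.

```python
def select_all(rows, txn_id):
    return [
        (rid, name)
        for rid, name, created, deleted in rows
        if created <= txn_id and (deleted is None or deleted > txn_id)
    ]


def delete_row(rows, txn_id, row_id):
    # DELETE only stamps the deletion version; the row itself stays in place.
    return [
        (rid, name, created, txn_id if rid == row_id and deleted is None else deleted)
        for rid, name, created, deleted in rows
    ]


# Each row is (id, name, create_ver, delete_ver); None means undefined.
rows = [(1, "yang", 1, None), (2, "long", 1, None),
        (3, "fei", 1, None), (4, "tian", 3, None)]

rows = delete_row(rows, txn_id=4, row_id=1)  # transaction 4 deletes id=1
second_read = select_all(rows, txn_id=2)     # statement (2) in transaction 2
later_read = select_all(rows, txn_id=5)      # a transaction started afterwards

print(second_read)
print(later_read)
```

Transaction 2 still sees id=1 because its deletion version 4 is greater than 2, while a transaction with a higher version number no longer does.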

 

UPDATE
When InnoDB executes an UPDATE, it actually inserts a new row whose creation version is the current transaction ID, and saves the current transaction ID as the deletion version of the row being updated.

Scenario 3
Suppose that after transaction 2 has executed (1), another user runs transactions 3 and 4, and then yet another user runs an UPDATE on this table.
The fifth transaction, with ID 5:
start transaction;
update yang set name='Long' where id=2;
commit;
Following the update rule, a new row is generated and this transaction's ID is written into the deletion version column of the original row being modified. The table becomes:

| id | name | creation version (transaction ID) | deletion version (transaction ID) |
|----|------|-----------------------------------|-----------------------------------|
| 1  | yang | 1                                 | 4                                 |
| 2  | long | 1                                 | 5                                 |
| 3  | fei  | 1                                 | undefined                         |
| 4  | tian | 3                                 | undefined                         |
| 2  | Long | 5                                 | undefined                         |

Transaction 2 then executes (2). Applying the conditions of the select statement gives the following:

| id | name |
|----|------|
| 1  | yang |
| 2  | long |
| 3  | fei  |

This is the same result that (1) returned in transaction 2.
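Scenario 3 in the same toy model. The helper names are hypothetical, and the starting state is the table after transactions 1, 3, and 4.

```python
def select_all(rows, txn_id):
    return [
        (rid, name)
        for rid, name, created, deleted in rows
        if created <= txn_id and (deleted is None or deleted > txn_id)
    ]


def update_name(rows, txn_id, row_id, new_name):
    updated = []
    for rid, name, created, deleted in rows:
        if rid == row_id and deleted is None:
            deleted = txn_id  # stamp the old row with the deletion version
        updated.append((rid, name, created, deleted))
    # Add the new row version, created by the updating transaction.
    updated.append((row_id, new_name, txn_id, None))
    return updated


# Each row is (id, name, create_ver, delete_ver); None means undefined.
rows = [(1, "yang", 1, 4), (2, "long", 1, None),
        (3, "fei", 1, None), (4, "tian", 3, None)]

rows = update_name(rows, txn_id=5, row_id=2, new_name="Long")  # transaction 5
second_read = select_all(rows, txn_id=2)  # statement (2) in transaction 2
later_read = select_all(rows, txn_id=6)   # a transaction started afterwards

print(second_read)
print(later_read)
```

Transaction 2 keeps reading the old name `long`, because the old row's deletion version 5 is greater than 2 and the new row's creation version 5 is too high. A later transaction sees only the new row `Long`.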
