当前位置：网站首页>MySQL foundation - database index and transaction

MySQL foundation - database index and transaction

2022-07-20 19:41:00 【invictusQAQ】

Catalog

1. Indexes

1.1 Concept

Index is a special kind of file , Contains a reference pointer to all records in the data table . You can index one or more columns in a table , And specify the type of index , Each index has its own data structure implementation .（ More complicated , We are here only for understanding ）

1.2 effect

Tables in the database 、 data 、 The relationship between indexes , It's like a book on a shelf 、 The relationship between book content and book catalog . The function of the index is similar to that of a book catalogue , It can be used for quick positioning 、 Retrieving data . Index is very helpful to improve the performance of database .

1.3 Use scenarios

Consider creating an index on a column or columns of a database table , The following points need to be considered ：
1. Large amount of data , And we often make conditional queries on these columns .
2. The insert operation of the database table , And for these columns The frequency of modification operation is low .
3. Index meeting Take up extra disk space . When the above conditions are met , Consider indexing these fields in the table , To improve query efficiency . conversely , If the non conditional query column , Or insert it often 、 Modify the operating , Or when there is not enough disk space , Don't think about creating indexes . （ The index will reduce the efficiency of addition, deletion and modification , Please increase the index carefully ）

Like our primary key （primary key） Because it will be queried frequently, the index is added by default .

1.4 Use

Let's take a brief look at the usage of index ：

Create a primary key constraint （PRIMARY KEY）、 Unique constraint （UNIQUE）、 Foreign key constraints （FOREIGN KEY） when , Automatically created Index of corresponding column .

Look at the index ：

show index from  Table name ;

Create index

For non primary keys 、 Non unique constraint 、 Non foreign key fields , You can create a normal index ：

create index  Index name  on  Table name ( Field name );

Delete index ：

drop index  Index name  on  Table name ;

1.5 matters needing attention

The index will not be used when querying , Low efficiency ：

Of course, the above situation is for us Single index , Next we will mention Joint index .

For example, the index is key index （a,b,c）. Can support a | a,b| a,c| a,b,c Combine to find , But does not support b| c| b,c Search for . Even if you mess up the order, such as c,b,a,mysql It will also automatically help you change to a,b,c, Then use the index . This is it. mysql Left most matching principle , Only the leftmost field of the composite index can be used in the query criteria . When the leftmost field is a constant reference , The index is very efficient .

In addition to the above ten scenarios, the joint index will be invalidated , Not followed Left most matching principle Time will also fail . That is, if you want to use the joint index when querying Must contain the leftmost index field of the given field .

1.6 The data structure of index implementation

The implementation of index mainly includes hash and B+ Trees , because hash We are already familiar with , Using us here is just Brief introduction once B+ Tree index . Here is a simple tree B+ Numerical diagram ：

Of course, the data is written casually , You can understand this structure and characteristics .

characteristic ：
1. Several values are stored on each node , At most, there are several sub trees
2. The elements of the parent node exist in the child node , Is on the child node Maximum / Minimum value
3. The lowest leaf node is represented by Linked list Form connection

B+ The advantages of trees ：
1. Fast query speed , Query similar to binary search tree
2. A single node stores more data , The height of the tree is low , There are fewer comparisons
3. All leaf nodes are connected by linked lists , Easy to scope （ The leaf node contains the complete set of databases ）
4. Each data row only needs to be saved in the leaf node , Non leaf nodes only need to store the data used for indexing id that will do . So non leaf nodes take up very little space , Make non leaf nodes （ Even part of ） Caching in memory becomes possible , Can greatly reduce the disk IO, Improve query efficiency

2. Business

2.1 Why use transactions

Take a very simple example , For example, your friend borrows money from you , You promised , Transfer it to him 500, But due to the system failure of the bank , Your account balance has decreased 500 And your friend didn't receive the money

Solution ： Use transactions to control , Ensure that all the above operations are successfully executed , Or all failed .

2.2 Concept of transactions

A transaction is a logical set of operations , The units that make up this set of operations , All or nothing , All or nothing . In different environments , You can have business . In the database , It's database transactions .

2.3 Use

（1） Open transaction ：start transaction;
（2） Execute more than one SQL sentence
（3） Roll back or commit ：rollback/commit;

start transaction;
--  Zhang San's account decreased 2000
update accout set money=money-2000 where name = ' Zhang San ';
--  Li Si's account increased 2000
update accout set money=money+2000 where name = ' Li Si ';
commit;

explain ：rollback It's all failure ,commit It's all success .

2.4 The basic characteristics of a transaction

1. Atomicity ： most The core Characteristics of , That is, a group of logical operations , The units that make up this set of operations , All or nothing , All or nothing
2. Uniformity ： Keep the data consistent , There are no mistakes
3. persistence ： As long as the transaction is executed successfully , The resulting modification is Persistent protection Deposited （ Save it on disk / In hard disk ）
4. Isolation, ： Describe multiple transactions Parallel execution What happened

We can just remember these points , Next, we will explain in detail the possible problems in parallel execution and the so-called Isolation sex .

2.5 Possible problems in parallel execution

2.5.1 Dirty reading problem

Literally, the data is polluted , The data read is not accurate . A business A In the process of execution , A series of modifications have been made to the data , Before submitting to the database （ Before completing the transaction ）, Another business B, Read the corresponding data , And then B The data read are all temporary results , The follow-up may be immediately A Revised , here B The reading behavior of becomes ” Dirty reading “.
You can simply understand that your classmate copied one of your unsubmitted （commit） The homework , And you find that there are mistakes in your homework , At this time, you will redo your homework （ analogy rollback）, At this time, your classmates' behavior of copying homework is ” Dirty reading “.

In order to solve the problem of dirty reading , We can make an agreement , In the transaction A Cannot be read by other transactions before committing . And this operation is equivalent to locking the read operation , Reduced concurrency and efficiency , At the same time, isolation is increased .

2.5.2 Unrepeatable read problem

Compared with dirty reading , Non repeatable questions refer to Business A After submission , Business B Just started reading （ At this time, the reading is locked ） And then in B During the execution of ,A Once again , Modified the data , here B In the process of execution , The two read operations may be inconsistent , This kind of problem is called unrepeatable .
For example, you are in GitHub I'm reading other people's projects , But the author started to update the project on a whim , You read and refresh, and suddenly find that the data seems to have been changed , This is called unrepeatable reading . The difference from dirty reading is that non repeatable transactions have been committed and modified .

In order to solve the problem of non repeatable reading , We can make an agreement , In the transaction A Cannot be read by other transactions before committing （ Previous locking ） And in Business A It is not allowed to be modified when being read .

2.5.3 The problem of unreal reading

The problem of magic degree is actually a little similar to a special case of non repeatable reading , Non repeatable reads are two read operations in a transaction, and the read results are different （ The same record ）, Phantom reading is read by two read operations in a transaction Result set Dissimilarity .
Similar to what we are doing GitHub When reading others' projects on , Originally only student.java A file , But the author suddenly added one in the process of our reading teacher.java file , This leads us to The result set read is not Same as , This is unreal reading .

The core of solving the problem of unreal reading is ” Serialization “, Speaking human is strictly required to be a business A Cannot perform any operation while being read . That is, the next one can only be executed after one is executed .

2.6 Parallelism and isolation

Why do we need concurrency , because efficiency , And why do we need isolation , because accuracy . These two are often Contrary to . In the actual development scenario, we need to choose according to our needs . See whether the scenario is more sensitive to performance or accuracy . Of course. MySQL There are also four gears for us to choose .

1.read uncommitted Concurrency is the strongest , Isolation is the weakest
2.read committed Only the submitted data can be read , Solved the problem of dirty reading
3.repeatable read There are restrictions on reading and writing , Solved the unrepeatable read problem （ Default this gear ）
4.serializable Strict serial execution , The phantom reading problem was solved , Minimum concurrency , Highest isolation

原网站

版权声明
本文为[invictusQAQ]所创，转载请带上原文链接，感谢
https://yzsam.com/2022/201/202207191724500016.html