openGauss Kernel Analysis: Query Rewriting
2022-07-21 08:50:00 [Huawei Cloud Developer Alliance]
Abstract: Query rewriting optimization can be based on relational algebra theory, or on heuristic rules.
This article is shared from the Huawei Cloud community post "openGauss Kernel Analysis (4): Query Rewriting", author: handsome and stern young fellow.
Query rewriting
SQL is rich, varied, and very flexible. Developers differ in experience, so hand-written SQL statements vary widely, and SQL can also be generated automatically by tools. SQL is a declarative language: the database user only describes the desired result and does not care how the data is obtained. As a result, the SQL that is entered is rarely expressed in its optimal form and often carries redundant information, which can be exploited to generate a more efficient SQL statement. Query rewriting converts the SQL statement entered by the user into a more efficient, equivalent SQL statement. It follows two basic principles.
• Equivalence: the original statement and the rewritten statement produce the same output.
• Efficiency: the rewritten statement is more efficient than the original in execution time and resource usage.
Query rewriting optimizations can be based on relational algebra theory, for example predicate pushdown and subquery optimization, or on heuristic rules, for example outer join elimination and join elimination. Query rewriting is a rule-based logical optimization.
At the code level, the architecture of query rewriting is as follows:
The rest of this article uses outer join elimination (Outer2Inner), that is, the conversion of an outer join into an inner join, as an example to analyze the query rewriting process. In a left outer join or right outer join, if there is a query condition on the nullable side (the table whose columns are padded with NULL for unmatched rows) that logically implies IS NOT NULL, for example c1 > 0, the query can be converted to an INNER JOIN, which reduces the intermediate result set produced by the join.
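Why a condition such as c1 > 0 "logically implies IS NOT NULL" follows from SQL's three-valued logic: a comparison with NULL evaluates to unknown, and WHERE keeps only rows whose condition is true. The following is a minimal sketch of that rule, not openGauss code; the names TriBool, NullableInt, gt_const and where_keeps are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical illustration types; not openGauss identifiers. */
typedef enum { TRI_FALSE, TRI_TRUE, TRI_UNKNOWN } TriBool;

typedef struct {
    bool is_null;   /* true when the column value is SQL NULL */
    int  value;     /* meaningful only when is_null is false */
} NullableInt;

/* Evaluate "col > constant" under SQL three-valued logic. */
TriBool gt_const(NullableInt col, int constant)
{
    if (col.is_null)
        return TRI_UNKNOWN;   /* any comparison with NULL is unknown */
    return col.value > constant ? TRI_TRUE : TRI_FALSE;
}

/* WHERE keeps a row only when its condition evaluates to true. */
bool where_keeps(TriBool cond)
{
    return cond == TRI_TRUE;
}
```

Because where_keeps rejects both false and unknown, a NULL-padded row produced by an outer join can never survive a condition like c1 > 0 on the padded column, which is what makes the conversion to an inner join safe.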
Outer join elimination (Outer2Inner)
The following example illustrates the differences between the various multi-table join types.
create table t1(c1 int, c2 int);
create table t2(c1 int, c2 int);
insert into t1 values(1, 10);
insert into t1 values(2, 20);
insert into t1 values(3, 30);
insert into t2 values(1, 100);
insert into t2 values(3, 300);
insert into t2 values(5, 500);
Inner join (inner join): returns only the row combinations that satisfy the join condition in both tables, which is equivalent to taking the intersection of the two tables.
SELECT * FROM t1 INNER JOIN t2 ON t1.c1 = t2.c1;
Left join (left outer join): returns all rows of the left table. If a row in the left table has no matching row in the right table, the right table's columns in the result are NULL.
SELECT * FROM t1 LEFT OUTER JOIN t2 ON t1.c1 = t2.c1;
Right join (right outer join): returns all rows of the right table. If a row in the right table has no matching row in the left table, the left table's columns in the result are NULL.
SELECT * FROM t1 RIGHT OUTER JOIN t2 ON t1.c1 = t2.c1;
Full join (full join): returns all rows of both the left and right tables. When a row has no matching row in the other table, the other table's columns are NULL. This is equivalent to taking the union of the two tables.
SELECT * FROM t1 FULL JOIN t2 ON t1.c1 = t2.c1;
Building on the experiment above, add a WHERE condition on table t2.
The left join and the inner join now return the same result. The query contains the condition WHERE t2.c2 > 100, so every row for which t2 has no match (the rows padded with NULL) is filtered out. The query can therefore be transformed from a left outer join into an inner join, which effectively reduces the result set produced by joining t1 and t2 and improves performance.
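The equivalence can be checked by hand on the sample rows. The sketch below is an illustration only: it models t1 and t2 as C arrays, runs a nested-loop join ON t1.c1 = t2.c1, and then applies WHERE t2.c2 > 100. The types Row and JoinedRow and the function join_then_filter are hypothetical names introduced for this example, not anything from openGauss.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sample data from the example above. */
typedef struct { int c1; int c2; } Row;

static const Row t1[] = { {1, 10}, {2, 20}, {3, 30} };
static const Row t2[] = { {1, 100}, {3, 300}, {5, 500} };
#define T1_ROWS (sizeof t1 / sizeof t1[0])
#define T2_ROWS (sizeof t2 / sizeof t2[0])

/* One output row of a join; right_is_null marks a NULL-padded t2 side. */
typedef struct {
    Row  left;
    Row  right;
    bool right_is_null;
} JoinedRow;

/* WHERE t2.c2 > 100: unknown for a NULL t2.c2, so such rows are rejected. */
static bool where_t2_c2_gt_100(const JoinedRow *r)
{
    return !r->right_is_null && r->right.c2 > 100;
}

/*
 * Nested-loop join of t1 and t2 ON t1.c1 = t2.c1, followed by the WHERE filter.
 * With keep_unmatched_left set it behaves like LEFT OUTER JOIN, otherwise like
 * INNER JOIN. Surviving rows are written to out, which must have room for
 * T1_ROWS * T2_ROWS entries; the return value is the number of rows written.
 */
size_t join_then_filter(bool keep_unmatched_left, JoinedRow *out)
{
    size_t n = 0;
    for (size_t i = 0; i < T1_ROWS; i++) {
        bool matched = false;
        for (size_t j = 0; j < T2_ROWS; j++) {
            if (t1[i].c1 != t2[j].c1)
                continue;
            matched = true;
            JoinedRow r = { t1[i], t2[j], false };
            if (where_t2_c2_gt_100(&r))
                out[n++] = r;
        }
        if (!matched && keep_unmatched_left) {
            JoinedRow r = { t1[i], {0, 0}, true };   /* NULL-padded row */
            if (where_t2_c2_gt_100(&r))              /* never true for padded rows */
                out[n++] = r;
        }
    }
    return n;
}
```

On this data both join_then_filter(true, out) and join_then_filter(false, out) return the single row (3, 30, 3, 300): the padded row for t1.c1 = 2 is produced only by the left join, and the filter discards it immediately.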
In the openGauss database system, subquery_planner traverses the rtable of the query tree to see whether a node of type RTE_JOIN exists and sets the hasOuterJoins flag accordingly. If the flag is set, it enters the reduce_outer_joins interface, which eliminates outer joins when the elimination conditions are met. The reduce_outer_joins function performs two actions internally: (1) reduce_outer_joins_pass1 is a pre-check. It inspects whether the jointree contains outer joins and collects information about the referenced tables in preparation for action 2; the key data structure here is reduce_outer_joins_state. (2) reduce_outer_joins_pass2 actually performs the outer join elimination.
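The range-table check can be pictured roughly as follows. This is a simplified sketch with hypothetical stand-in types (SketchRte, SketchRteKind, SketchRteJoinType) rather than the real RangeTblEntry, and it assumes that only join entries whose type is not inner count as outer joins; the article itself only states that RTE_JOIN entries are looked for.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-ins for the range-table structures. */
typedef enum { SK_RTE_RELATION, SK_RTE_JOIN } SketchRteKind;
typedef enum { SK_INNER, SK_LEFT, SK_RIGHT, SK_FULL } SketchRteJoinType;

typedef struct {
    SketchRteKind     rtekind;
    SketchRteJoinType jointype;   /* meaningful only for SK_RTE_JOIN entries */
} SketchRte;

/* Return true if any range-table entry is a join whose type is an outer join. */
bool sketch_has_outer_joins(const SketchRte *rtable, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (rtable[i].rtekind == SK_RTE_JOIN && rtable[i].jointype != SK_INNER)
            return true;
    }
    return false;
}
```

For the example query, the range table would hold one entry each for t1 and t2 plus a join entry of type left, so the function returns true and the reduction step is worth running.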
void reduce_outer_joins(PlannerInfo* root)
{
    reduce_outer_joins_state* state = NULL;

    /* Pass 1: collect information about the join tree. */
    state = reduce_outer_joins_pass1((Node*)root->parse->jointree);

    /* planner.c shouldn't have called me if no outer joins */
    if (state == NULL || !state->contains_outer)
        ereport(ERROR,
            (errmodule(MOD_OPT),
                errcode(ERRCODE_OPTIMIZER_INCONSISTENT_STATE),
                (errmsg("so where are the outer joins?"))));

    /* Pass 2: perform the actual outer join elimination. */
    reduce_outer_joins_pass2((Node*)root->parse->jointree, state, root, NULL, NIL, NIL);
}
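The division of labor between the two passes can be modeled on a toy join tree. The sketch below is a simplified illustration under stated assumptions, not the openGauss implementation: SketchNode, sketch_pass1 and sketch_pass2 are hypothetical names, tables are identified by bits in an unsigned mask, the set of tables on which a strict WHERE condition rejects NULLs is passed in directly, and full joins and conditions in the ON clause are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the planner structures; all names are hypothetical. */
typedef enum { SK_JOIN_INNER, SK_JOIN_LEFT, SK_JOIN_RIGHT } SketchJoinType;

typedef struct SketchNode {
    bool is_join;                 /* false: base table, true: join */
    unsigned relid_bit;           /* base table: one bit identifying the table */
    SketchJoinType jointype;      /* join: current join type */
    struct SketchNode *larg;      /* join: left input */
    struct SketchNode *rarg;      /* join: right input */
    /* Filled in by pass 1: */
    unsigned relids;              /* bitmask of base tables below this node */
    bool contains_outer;          /* any outer join at or below this node? */
} SketchNode;

/* Pass 1: record which tables each subtree covers and whether it has outer joins. */
void sketch_pass1(SketchNode *node)
{
    assert(node != NULL);
    if (!node->is_join) {
        node->relids = node->relid_bit;
        node->contains_outer = false;
        return;
    }
    sketch_pass1(node->larg);
    sketch_pass1(node->rarg);
    node->relids = node->larg->relids | node->rarg->relids;
    node->contains_outer = node->jointype != SK_JOIN_INNER
        || node->larg->contains_outer || node->rarg->contains_outer;
}

/*
 * Pass 2: nonnullable_rels is the bitmask of tables on which a strict
 * condition above this node rejects NULLs. An outer join whose nullable side
 * overlaps that set cannot emit a NULL-padded row that survives the
 * condition, so it is turned into an inner join.
 */
void sketch_pass2(SketchNode *node, unsigned nonnullable_rels)
{
    assert(node != NULL);
    if (!node->is_join || !node->contains_outer)
        return;                   /* nothing to reduce below this node */
    if (node->jointype == SK_JOIN_LEFT && (nonnullable_rels & node->rarg->relids))
        node->jointype = SK_JOIN_INNER;
    else if (node->jointype == SK_JOIN_RIGHT && (nonnullable_rels & node->larg->relids))
        node->jointype = SK_JOIN_INNER;
    sketch_pass2(node->larg, nonnullable_rels);
    sketch_pass2(node->rarg, nonnullable_rels);
}
```

To model the example query, give t1 the bit 1 and t2 the bit 2, build a left join node over them, run sketch_pass1, and then call sketch_pass2 with the mask 2 (the condition t2.c2 > 100 rejects NULLs on t2). The join type becomes SK_JOIN_INNER. With the mask 1 (a condition on t1 only) the left join is left unchanged.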
Using the analysis method from the previous installment, you can inspect the in-memory structure of the query tree. In the Query structure, targetList stores the semantic analysis result of the target attributes, rtable stores the range table generated from the FROM clause, and the quals field of jointree stores the expression tree produced by semantic analysis of the WHERE clause.
Comparing the query tree before and after reduce_outer_joins runs shows that the jointype in both jointree and rtable has changed from JOIN_LEFT to JOIN_INNER, that is, the outer join has been converted to an inner join.
(gdb) p *((JoinExpr*)(parse->jointree->fromlist->head.data->ptr_value))
$1 = {type = T_JoinExpr, jointype = JOIN_INNER, isNatural = false, larg = 0x7fdfb345cd08, rarg = 0x7fdfb345e2e8, usingClause = 0x0, quals = 0x7fdfb2f0b8a8, alias = 0x0, rtindex = 3}
(gdb) p *(RangeTblEntry*)(parse->rtable->tail.data->ptr_value)
$2 = {type = T_RangeTblEntry, rtekind = RTE_JOIN, relname = 0x0, partAttrNum = 0x0, relid = 0, partitionOid = 0, isContainPartition = false, subpartitionOid = 0, isContainSubPartition = false, refSynOid = 0, partid_list = 0x0, relkind = 0 '\000', isResultRel = false, tablesample = 0x0, timecapsule = 0x0, ispartrel = false, ignoreResetRelid = false, subquery = 0x0, security_barrier = false, jointype = JOIN_INNER, …}