当前位置:网站首页>Kettle optimization to improve MySQL read and write speed
Kettle optimization to improve MySQL read and write speed
2022-07-21 04:39:00 【Drinking wine under the lone shadow moon】
1. background
background : stay mysql Data migration in the database ( Migration between tables , Field conversion ), When you find a large amount of data in the table ,kettle The speed of reading and writing tables is very slow , And the speed will be particularly slow after reaching a certain amount of data , When the amount of data is small, the reading speed is very fast .
2.kettle Perform speed measurement
2.1 Read and write data speed test
5000 Data reading and writing speed test :8min24s namely 504s
20000 Data reading and writing speed test : 16min32s namely 992s
200000 Data reading and writing speed test , The task fails directly , Can't run .
3. To optimize the direction
3.1 Modify database connection parameters
Open the database connection of table input and table output and add the following parameters : Table input or table output => Edit database connection => Options
Named parameters | value | Parameter description |
---|---|---|
useCompression | true | Turn on data compression during transmission , Improve transmission efficiency |
rewriteBatchedStatements | true | Rearrange the database Insert sentence , Merge multiple insert statements into one , Submit insertion efficiency |
useServerPrepStmts | false | Rearrange the database Insert sentence , Merge multiple insert statements into one , Submit insertion efficiency |
5000 Read and write results of pieces of data :3.1s
20000 Data reading and writing speed test : 11.6s
200000 Data reading and writing speed test : 2min18s namely 138s
After the modification of database parameters , Reading and writing efficiency has been significantly improved 86% above .
3.2 kettle modify spoon.bat Running memory size of ( On the basis of modifying data parameters )
stay kettle Edit under the installation directory Spoon.bat file , Find the following parameters :
# The default value is
if "%PENTAHO_DI_JAVA_OPTIONS%"=="" set PENTAHO_DI_JAVA_OPTIONS="-Xms1024m" "-Xmx2048m" "-XX:MaxPermSize=256m"
# Modified value ( Modify according to your computer configuration )
"%PENTAHO_DI_JAVA_OPTIONS%"=="" set PENTAHO_DI_JAVA_OPTIONS= "-Xmx4096m" "-XX:MaxPermSize=4096m"
# Parameters,
# "-Xms4096m": To configure java The size of the initial memory allocation in the virtual machine heap area Not configured
# "-Xmx4096m": To configure java Maximum amount of virtual machine heap memory that can be allocated
# "-XX:MaxPermSize=4096m": To configure java The maximum limit of memory allocated to non heap areas
Be careful :
Development process , Usually will -Xms And -Xmx The configuration of the two parameters is the same , The aim is to be able to java After the garbage collection mechanism cleans up the heap area, it does not need to re separate and calculate the size of the heap area to waste resources . Before configuration, you must carefully consider the size of non heap memory required by your software , Because the memory here will not be java Where the garbage collection mechanism handles it . And more importantly, the sum of the maximum heap memory and the maximum non heap memory must not exceed the available memory of the operating system .
The number of submitted records is changed to 10000, That is, each submission 10000 Bar record
5000 Read and write results of pieces of data :3.2s
20000 Data reading and writing speed test : 10.6s
200000 Data reading and writing speed test : 2min1s namely 121s
Compare the results , With sufficient resources , The connection parameters of the database have a great impact on the reading speed .
3. The effect of optimization ( Read 840 Time consuming of 10000 data )
8488225 Data reading and writing speed test : 1h 55min 10s namely 6910s
4. summary
Use kettle You must pay attention to :
1、 Check the driver version of the connection database , To do it jar The package version and database version should be as close as possible ;
2、 Create a database connection url The following two parameters must be added to :
?autoReconnect=true&failOverReadOnly=false
3、 The following three parameters must be added to the parameters for creating a database connection :( Used to increase data writing speed )
useServerPrepStmts=false
rewriteBatchedStatements=true
useCompression=true
4、 Create database connection , If the connection pool used , You have to set maxIdl It's worth more :
5、 Must be based on kettle The hardware where the software resides , modify spoon.bat(window) or spoon.sh(linux) The memory size of :
边栏推荐
猜你喜欢
JS基础--Object静态方法
静态路由的综合实验
Douban score 9.4! Study "neural network and deep learning" with Professor Qiu Xipeng, and the class starts on July 19
Become a blogger in a year? To! Struggling oneself
静态路由—综合实验
End, to my college life
JS笔试题--Promise,setTimeout,任务队列综合题
Configuration of static routes to achieve network wide accessibility
What are the serious consequences of the Internet giant's "winner takes all"?
Concepts de base de la langue C - petites connaissances une fois par jour
随机推荐
Quickly understand shell scripts
系统安全及应用
Back to the origin: start learning again
落幕,致我的大学生活
testApi(node写测试接口无需数据库,有基础数据)
静态路由的综合实验
英语入门笔记
Game improvement of smart people: Lesson 2 of Chapter 3: k-number multiplication (Cheng)
Conditional judgment if statement case statement
安装和管理程序
The difference between TCP and UDP
手动安装Apache
Guidance process and service control
router-link打开新页面跳转和a标签防止默认跳转及各个属性
JS基础--Math
Dynamic routing protocol rip experiment
将列表中的字符以逗号分割的形式提取出来
JS面试题--ES5和ES6有什么区别?
File system and log analysis
LVM与磁盘配额