当前位置:网站首页>Usage and examples of Apache Doris binlog load
Usage and examples of Apache Doris binlog load
2022-07-21 03:03:00 【Zhangjiafeng】
1. Installation configuration Mysql
install Mysql
Quick to use Docker Installation configuration Mysql, Refer to the following connection for details
https://segmentfault.com/a/1190000021523570
If you are installing on a physical machine, you can refer to the following connection :
stay CentOS 7 Install in MySQL 8 Detailed explanation of the course
Turn on Mysql binlog
Get into Docker Modify on container or physical machine /etc/my.cnf file , stay [mysqld] Add the following below ,
log_bin=mysql_binbinlog-format=Rowserver-id=1
And then restart Mysql
systemctl restart mysqld
establish Mysql surface
create database demo; CREATE TABLE `test_cdc` ( `id` int NOT NULL AUTO_INCREMENT, `sex` TINYINT(1) DEFAULT NULL, `name` varchar(20) DEFAULT NULL, `address` varchar(255) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB
2. Installation configuration Canal
download canal-1.1.5: https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz
decompression Canal To the specified directory :
tar zxvf canal.deployer-1.1.5.tar.gz -C ./canal
stay conf Create a new directory under the folder and rename , As instance Root directory , You can name the directory name yourself for easy identification
For example, my name here is consistent with my database name :demo
vi conf/demo/instance.properties
Here is my example configuration :
Please refer to... For parameter description Canal Official documents :QuickStart
################################################### mysql serverId , v1.0.26+ will autoGencanal.instance.mysql.slaveId=12115 # enable gtid use true/falsecanal.instance.gtidon=false # position infocanal.instance.master.address=10.220.146.11:3306canal.instance.master.journal.name=canal.instance.master.position=canal.instance.master.timestamp=canal.instance.master.gtid= # rds oss binlogcanal.instance.rds.accesskey=canal.instance.rds.secretkey=canal.instance.rds.instanceId= # table meta tsdb infocanal.instance.tsdb.enable=true#canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_tsdb#canal.instance.tsdb.dbUsername=canal#canal.instance.tsdb.dbPassword=canal #canal.instance.standby.address =#canal.instance.standby.journal.name =#canal.instance.standby.position =#canal.instance.standby.timestamp =#canal.instance.standby.gtid= # username/passwordcanal.instance.dbUsername=zhangfengcanal.instance.dbPassword=zhangfeng800729)(*Qcanal.instance.connectionCharset = UTF-8# enable druid Decrypt database passwordcanal.instance.enableDruid=false#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ== # table regexcanal.instance.filter.regex=demo\\..*# table black regexcanal.instance.filter.black.regex=# table field filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)#canal.instance.filter.field=test1.t_product:id/subject/keywords,test2.t_company:id/name/contact/ch# table field black filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)#canal.instance.filter.black.field=test1.t_product:subject/product_image,test2.t_company:id/name/contact/ch # mq config#canal.mq.topic=# dynamic topic route by schema or table regex#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*#canal.mq.partition=0# hash partition config#canal.mq.partitionsNum=3#canal.mq.partitionHash=test.table:id^name,.*\\..*#################################################
start-up Canal
sh bin/startup.sh
Be careful :canal instance user/passwd
1.1.5 edition , stay canal.properties Add these two configurations
canal.user = canal canal.passwd = E3619321C1A937C46A0D8BD1DAC39F93B27D4458
The default password is canal/canal,canal.passwd The password value of can be through select password(“xxx”) To get
Verify that startup succeeded
tail -200f logs/demo/demo.log
3. Start synchronizing data
3.1 establish Doris Target table
Users need to be first in Doris End creation is good with Mysql End corresponding target table
Binlog Load Can only support Unique Target table of type , And the target table must be activated Batch Delete function .
Turn on Batch Delete We can refer to help alter table
Batch deletion function in .
CREATE TABLE `doris_mysql_binlog_demo` ( `id` int NOT NULL, `sex` TINYINT(1), `name` varchar(20), `address` varchar(255) ) ENGINE=OLAPUNIQUE KEY(`id`,sex)COMMENT "OLAP"DISTRIBUTED BY HASH(`sex`) BUCKETS 1PROPERTIES ("replication_allocation" = "tag.location.default: 3","in_memory" = "false","storage_format" = "V2");-- enable batch deleteALTER TABLE test_2.doris_mysql_binlog_demo ENABLE FEATURE "BATCH_DELETE";
3.1 Create a synchronization job
3.1.1 Create Sync Job Syntax description
Name: ‘CREATE SYNC JOB’ Description:
Data synchronization (Sync Job) function , Support users to submit a resident data synchronization job , By reading... From the specified remote address Binlog journal , Incremental synchronization user at Mysql The of a database to data update operations CDC(Change Data Capture) function .
At present, data synchronization only supports docking Canal, from Canal Server Get the parsed Binlog data , Import to Doris Inside .
User access SHOW SYNC JOB
View data synchronization job status .
grammar :
CREATE SYNC [db.]job_name ( channel_desc, channel_desc ... )binlog_desc
job_name
Synchronization job name , Is the unique identification of the job in the current database , identical
job_name
Only one job can be running .channel_desc
Data channel under job , Used to describe the mysql Source table to doris Mapping relationship of target table .
grammar :
FROM mysql_db.src_tbl INTO des_tbl [partitions] [columns_mapping]
mysql_db.src_tbl
Appoint mysql End database and source table .
des_tbl
Appoint doris End target table , Only support Unique surface , And you need to open the table batch delete function ( For the opening method, please see help alter table Of ’ Batch delete function ’).
partitions
Specify which tables to import to partition in . If you don't specify , It will be automatically imported into the corresponding partition in .
Example :
PARTITION(p1, p2, p3)
column_mapping
Appoint mysql Source table and doris The mapping relationship between the columns of the target table . If you don't specify ,FE By default, the columns of the source table and the target table correspond one by one in order .
I won't support it col_name = expr Represents the column in the form of .
Example :
Suppose the target table is listed as (k1, k2, v1), Change Columns k1 and k2 The order of COLUMNS(k2, k1, v1) Ignore the fourth column of the source data COLUMNS(k2, k1, v1, dummy_column)
binlog_desc
Used to describe remote data sources , Currently only supported canal A kind of .
grammar :
FROM BINLOG ( "key1" = "value1", "key2" = "value2" )
Canal Attributes corresponding to the data source , With
canal.
The prefix- canal.server.ip: canal server The address of
- canal.server.port: canal server The port of
- canal.destination: instance The logo of
- canal.batchSize: Acquired batch The maximum size , Default 8192
- canal.username: instance Username
- canal.password: instance Password
- canal.debug: Optional , Set to true when , Will batch And the details of each row of data are printed Examples:
Simple as
test_db
Oftest_tbl
Create a file calledjob1
Data synchronization job , Connect local Canal The server , Corresponding Mysql Source tablemysql_db1.tbl1
.CREATE SYNC `test_db`.`job1` ( FROM `mysql_db1`.`tbl1` INTO `test_tbl ` ) FROM BINLOG ( "type" = "canal", "canal.server.ip" = "127.0.0.1", "canal.server.port" = "11111", "canal.destination" = "example", "canal.username" = "", "canal.password" = "" );
by
test_db
Create a table namedjob1
Data synchronization job , One by one corresponds to multiple sheets Mysql Source table , And explicitly specify the column mapping .CREATE SYNC `test_db`.`job1` ( FROM `mysql_db`.`t1` INTO `test1` COLUMNS(k1, k2, v1) PARTITIONS (p1, p2), FROM `mysql_db`.`t2` INTO `test2` COLUMNS(k3, k4, v2) PARTITION p1 ) FROM BINLOG ( "type" = "canal", "canal.server.ip" = "xx.xxx.xxx.xx", "canal.server.port" = "12111", "canal.destination" = "example", "canal.username" = "username", "canal.password" = "password" );
3.1.2 Start syncing mysql The data in the table to Doris
Be careful :
Before creating a synchronization task , In the first fe.conf Internal configuration enable_create_sync_job=true, The default is false Is not enabled , Otherwise, you cannot create a synchronization task
CREATE SYNC test_2.doris_mysql_binlog_demo_job ( FROM demo.test_cdc INTO doris_mysql_binlog_demo) FROM BINLOG ( "type" = "canal", "canal.server.ip" = "10.220.146.10", "canal.server.port" = "11111", "canal.destination" = "demo", "canal.username" = "canal", "canal.password" = "canal");
3.1.3 View synchronization task
SHOW SYNC JOB from test_2;
3.1.4 Look at the data in the table
select * from doris_mysql_binlog_demo;
3.1.5 Delete data
We are Mysql Delete data from the data table , And then look Doris Changes inside and outside
delete from test_cdc where id in (12,13)
We're going to see Doris table ,id yes 12,13 These two pieces of data have been deleted
3.1.6 Multi meter synchronization
Multi table synchronization only needs to be written as follows
CREATE SYNC test_2.doris_mysql_binlog_demo_job ( FROM demo.test_cdc INTO doris_mysql_binlog_demo, FROM demo.test_cdc_1 INTO doris_mysql_binlog_demo, FROM demo.test_cdc_2 INTO doris_mysql_binlog_demo, FROM demo.test_cdc_3 INTO doris_mysql_binlog_demo)
边栏推荐
- cmake基本语法以及实战项目分析
- AI2 (APP inventor 2) offline version
- [scientific literature measurement] statistics and visualization of word number and frequency of Chinese and English literature titles and abstracts
- DOM之事件对象
- 电路板调试
- [scientific literature measurement] analysis and visualization of readability indicators of Chinese and English literature titles and abstracts
- HMS core machine learning service creates a new "sound" state of simultaneous interpreting translation, and AI makes international exchanges smoother
- [Muduo] build project compile cmake file and noncopyable
- Silicon Valley classroom notes (Part 1)
- Dictionary preface - company naming
猜你喜欢
Understand and apply continuous deployment Argo CD
How to protect user privacy without password authentication?
Technical dry goods | solve 80% of the problems in the interview, and realize auc/roc based on mindspire
硅谷课堂笔记(上)
Lombok详细介绍
How does HMS core security detection service help freshmen prevent Telecom fraud?
Practice of online problem feedback module (VIII): realize image upload function (Part 1)
J9数字平台论:元宇宙中DeFi的可能性和局限性
[AD learning record] Why are schematic diagrams and PCBs in the same folder, and PCB cannot be generated?
How should enterprise users choose aiops or APM?
随机推荐
[scientific literature measurement] analysis and visualization of readability indicators of Chinese and English literature titles and abstracts
DNP3 模拟器使用教程
Ilrunitme foreach has GC
30 spark introduction: Spark Technology stack explanation, partition, system architecture, operator and task submission method
Chinese anchors can also bring goods overseas! Simultaneous interpretation helps live broadcast applications expand overseas markets
混淆矩阵的计算方式
【科学文献计量】中英文文献标题及摘要分词字数与频数统计与可视化
Apache Doris 使用 Prometheus Alertmanager 模块发送 异常信息至钉钉报警群
开发中常见环境配置名词-dev、sit、pro、fac等
[wechat applet] text field input with maximum word limit (1/100)
Densenet learning notes (core vs. RESNET):
DNS域名解析过程剖析
Illustration leetcode - 731 My schedule II (difficulty: medium)
[Muduo log system 2] timestamp time
Vivo official website app full model UI adaptation scheme
STL list构造函数、大小
Apprentissage et utilisation de websocket
Partial voice feature recording
Monomer or microservice? You think it's an architectural tradeoff? It's actually a cognitive load!
In depth understanding of MySQL execution process and execution sequence