Redis cluster details
2022-07-20 21:58:00 【bseayin】
Let's start with a rough overview of Redis Cluster.
Redis Cluster requires at least 3 master nodes to form a cluster, and each master should have at least one slave node. Nodes communicate with each other over TCP. When a master crashes, Redis Cluster promotes the corresponding slave to master so that service can resume.
Redis Cluster provides three functions: load balancing, failover, and master-slave replication.
Load balancing
Let's start with slots. Each Redis instance in the cluster is responsible for a portion of the slots. The total number of slots is 16384 (2^14). With 3 masters, each one is responsible for about 5461 slots (16384 / 3 ≈ 5461.33, so one master ends up with 5462).
Redis node | Slots |
---|---|
node 1 | 0-5460 |
node 2 | 5461-10922 |
node 3 | 10923-16383 |
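The slot arithmetic can be sketched in a few lines of Python. This is only an illustration: `split_slots` is a hypothetical helper, and the rounding scheme is an assumption chosen so that the ranges do not overlap and cover every slot exactly once. Real tooling may split the slots differently.

```python
TOTAL_SLOTS = 16384


def split_slots(master_count):
    """Return one (first, last) slot range per master, covering 0..16383."""
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        last = round(cursor + per_master - 1)
        if i == master_count - 1:
            last = TOTAL_SLOTS - 1  # the last master takes whatever remains
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


for node, (first, last) in enumerate(split_slots(3), start=1):
    print(f"node {node}: {first}-{last} ({last - first + 1} slots)")
```

Each range starts one slot after the previous range ends, so no slot is owned by two masters.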
When a Redis client sets a value, it runs the CRC16 algorithm on the key and takes the result modulo 16384. The result is the slot the key falls into, and the table above then tells which node owns that slot. Because 16384 is a power of two, the modulo can be written as a bitwise AND. The slot formula is as follows:
slot = CRC16(key) & 16383
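As a rough illustration of the formula, here is a minimal Python sketch. It assumes the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), and it ignores hash tags, which the text above does not cover. The names `crc16` and `key_slot` are made up for this example.

```python
def crc16(data: bytes) -> int:
    """Bitwise CRC16, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """slot = CRC16(key) & 16383, i.e. CRC16(key) mod 16384."""
    return crc16(key.encode("utf-8")) & 16383


print(key_slot("foo"))  # 12182
```

With three masters splitting the slots evenly, slot 12182 belongs to the third node.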
In a Redis cluster, every node keeps information about the other nodes, such as their IP addresses and the slots they are responsible for.
How does JedisCluster address the cluster?
JedisCluster only needs the IP and port of one node in the cluster in its configuration. On initialization, JedisCluster connects to the configured node and fetches the information for the whole cluster (with the cluster nodes command).
It parses the cluster information to find all masters in the cluster, then iterates over each master, builds a Jedis instance from its IP and port, and puts it into a global nodes variable (a Map). The key is ip:port and the value is the Jedis instance. nodes looks like this:
nodes={172.19.93.120:<port>=<Jedis instance>,.....}
While iterating over the masters, it does one more thing: it walks the slot indexes that each master is responsible for and puts them into a global map named slots. The key is the slot index and the value is the Jedis instance from above. slots looks like this:
slots={<slot>=<Jedis instance>,
<slot>=<Jedis instance>,
<slot>=<Jedis instance>,
....
5461=<Jedis instance>, #### a different master machine
....
<slot>=<Jedis instance>}
With the slots variable in place, when a value is set, the client first computes slot = getCRC16(key) & (16384 - 1). Say the result is 12182. It then calls slots.get(12182) to get the Jedis instance and uses it to operate on Redis.
If a MOVED redirection is reported (JedisMovedDataException in Jedis), the slot-to-node mapping built at initialization is out of date (a node was added or went down), and slots is rebuilt.
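The lookup-and-refresh behaviour can be sketched as a small in-memory simulation. Nothing here is Jedis code: `FakeCluster`, `SlotCachingClient`, and `MovedError` are hypothetical stand-ins, and the slot ranges are assumed values. `binascii.crc_hqx` with an initial value of 0 is used as the CRC16 function.

```python
import binascii

TOTAL_SLOTS = 16384


class MovedError(Exception):
    """Hypothetical stand-in for a MOVED redirection."""


def key_slot(key):
    # crc_hqx with initial value 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(key.encode("utf-8"), 0) & (TOTAL_SLOTS - 1)


class FakeCluster:
    """Hypothetical in-memory cluster: slot ranges plus per-node storage."""

    def __init__(self, ranges):
        self.ranges = dict(ranges)  # node name -> (first slot, last slot)
        self.data = {node: {} for node in ranges}

    def owner(self, slot):
        for node, (first, last) in self.ranges.items():
            if first <= slot <= last:
                return node
        raise KeyError(slot)

    def set(self, node, key, value):
        if self.owner(key_slot(key)) != node:
            raise MovedError(key)  # the client asked the wrong node
        self.data[node][key] = value


class SlotCachingClient:
    def __init__(self, cluster):
        self.cluster = cluster
        self.refresh()

    def refresh(self):
        # Rebuild the slot -> node map from the current cluster layout.
        self.slots = {}
        for node, (first, last) in self.cluster.ranges.items():
            for slot in range(first, last + 1):
                self.slots[slot] = node

    def set(self, key, value):
        node = self.slots[key_slot(key)]
        try:
            self.cluster.set(node, key, value)
        except MovedError:
            self.refresh()  # cached mapping was stale, rebuild and retry once
            node = self.slots[key_slot(key)]
            self.cluster.set(node, key, value)
        return node


cluster = FakeCluster({"node1": (0, 5460), "node2": (5461, 10922), "node3": (10923, 16383)})
client = SlotCachingClient(cluster)
first_owner = client.set("foo", "bar")

# Simulate a new master taking over part of node3's range.
cluster.ranges = {"node1": (0, 5460), "node2": (5461, 10922),
                  "node3": (10923, 12000), "node4": (12001, 16383)}
cluster.data["node4"] = {}
second_owner = client.set("foo", "baz")
print(first_owner, second_owner)
```

The second call hits the stale mapping first, receives the simulated redirection, rebuilds the map, and then succeeds on the new owner.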
Communication between cluster nodes
There are generally two ways to maintain cluster metadata such as node information. One is centralized, for example Spring Cloud, where the service cluster information is stored in a configuration center. The other is the approach Redis uses: gossip.
Centralized: the advantage is that metadata updates and reads are very timely. Once the metadata changes, it is immediately written to the centralized store, and other nodes see the change as soon as they read it. The disadvantage is that all the metadata update pressure is concentrated in one place, which may put pressure on the metadata store.
Gossip: the advantage is that metadata updates are spread out rather than concentrated in one place. Update requests reach all nodes one after another, which reduces the pressure. The disadvantage is that metadata updates are delayed, so some cluster operations may lag.
The communication port is the node's own Redis listening port + 10000. For example, if the listening port is 6379, the communication port is 16379.
The main responsibility of gossip is to exchange information. The carriers of that exchange are the gossip messages that nodes send to each other. The commonly used gossip messages are ping, pong, meet, and fail, among others.
- meet message: used to announce that a new node is joining. The sender tells the receiver to join the current cluster. Once the meet exchange completes normally, the receiving node joins the cluster and takes part in the periodic ping and pong exchange.
- ping message: the most frequently exchanged message in the cluster. Every second, each node sends ping messages to several other nodes to detect whether they are online and to exchange status information. A ping message carries the status data of the sender and of some other nodes.
- pong message: sent back to the sender in response to a ping or meet message to confirm that communication is working. A pong message carries the sender's own status data. A node can also broadcast its own pong message to the cluster to tell all nodes to update its status.
- fail message: when a node decides that another node in the cluster is offline, it broadcasts a fail message to the cluster. Nodes that receive the fail message mark the corresponding node as offline.
For example, when a new node is added, the meet message process is as follows:
- Node A creates a clusterNode structure for node B and adds it to its own clusterState.nodes dictionary.
- Node A sends a MEET message to node B, using the IP address and port number given in the CLUSTER MEET command.
- Node B receives the MEET message from node A, creates a clusterNode structure for node A, and adds it to its own clusterState.nodes dictionary.
- Node B returns a PONG message to node A.
- Node A receives the PONG message from node B. From this PONG message, node A knows that node B has successfully received its MEET message.
- Node A then sends a PING message back to node B.
- Node B receives the PING message from node A. From this PING message, node B knows that node A has successfully received its PONG message, and the handshake is complete.
- Afterwards, node A propagates node B's information to the other nodes in the cluster through the gossip protocol so that they also shake hands with node B. Eventually, after some time, node B is known to all nodes in the cluster.
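The handshake steps above can be traced with a toy Python model. It is a sketch only: the `Node` class and its `nodes` dictionary are stand-ins for the clusterNode and clusterState.nodes structures, messages are plain method calls instead of network traffic, and the later gossip propagation to other nodes is left out.

```python
class Node:
    """Toy cluster node; `nodes` stands in for clusterState.nodes."""

    def __init__(self, name):
        self.name = name
        self.nodes = {}
        self.handshaken = set()
        self.log = []

    def cluster_meet(self, other):
        self.nodes[other.name] = other  # A records B before sending MEET
        self.log.append(f"send MEET to {other.name}")
        other.receive("MEET", self)

    def receive(self, message, sender):
        self.log.append(f"recv {message} from {sender.name}")
        if message == "MEET":
            self.nodes[sender.name] = sender  # B records A
            sender.receive("PONG", self)      # B answers with PONG
        elif message == "PONG":
            self.handshaken.add(sender.name)  # A knows its MEET arrived
            sender.receive("PING", self)      # A answers with PING
        elif message == "PING":
            self.handshaken.add(sender.name)  # B knows its PONG arrived


a, b = Node("A"), Node("B")
a.cluster_meet(b)
print(a.log)
print(b.log)
```

After the call, each node has the other in its `nodes` dictionary and both sides consider the handshake complete.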
For example, when a node fails, how is it judged to be offline?
Every node in the cluster periodically sends ping messages to other nodes. If the node that receives the ping does not reply with a pong within the specified time, the sender marks that node as subjectively offline.
If more than half of the master nodes in the cluster mark master node A as subjectively offline, node A is marked as objectively offline (and this is broadcast to the other nodes), that is, it is considered truly offline.
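The two-stage decision can be written down as a pair of small Python functions. This is a simplified sketch: the function names, the timestamp arguments, and the `node_timeout` parameter are assumptions for illustration, not the actual Redis data structures.

```python
def is_subjectively_down(last_pong_at, now, node_timeout):
    """One node's private opinion: no pong arrived within the timeout."""
    return now - last_pong_at > node_timeout


def is_objectively_down(reporting_masters, all_masters):
    """True when more than half of the masters report the node as down."""
    votes = len(set(reporting_masters) & set(all_masters))
    return votes * 2 > len(all_masters)


masters = {"m1", "m2", "m3", "m4", "m5"}
print(is_subjectively_down(last_pong_at=100.0, now=120.0, node_timeout=15.0))
print(is_objectively_down({"m1", "m2"}, masters))
print(is_objectively_down({"m1", "m2", "m3"}, masters))
```

Only reports from masters are counted, so an opinion from a slave does not move the vote.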
Failover
When a slave node finds that the master node it is replicating has gone offline, the slave node starts a failover for the offline master. The steps are as follows:
- The slave node executes the SLAVEOF no one command and becomes the new master node.
- The new master node revokes all slot assignments of the offline master and assigns those slots to itself.
- The new master node broadcasts a PONG message to the cluster. This PONG message lets the other nodes know immediately that this node has changed from a slave to a master and has taken over the slots previously handled by the offline node.
- The new master node starts accepting command requests for the slots it is now responsible for, and the failover is complete.
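The bookkeeping side of these steps can be sketched as follows. The `fail_over` function, the `roles` and `slot_owner` dictionaries, and the shape of the returned message are all hypothetical. The sketch only covers role promotion, slot takeover, and the announcement that would be broadcast.

```python
def fail_over(slot_owner, roles, failed_master, slave):
    """Promote `slave` and hand it every slot owned by `failed_master`."""
    roles[slave] = "master"  # effect of SLAVEOF no one
    taken = []
    for slot, owner in slot_owner.items():
        if owner == failed_master:
            slot_owner[slot] = slave  # revoke from the old master, assign to self
            taken.append(slot)
    # Content of the PONG that would be broadcast to the rest of the cluster.
    return {"type": "PONG", "sender": slave, "role": "master", "slots": taken}


roles = {"m1": "master", "m2": "master", "s1": "slave"}
slot_owner = {0: "m1", 1: "m1", 2: "m2"}
pong = fail_over(slot_owner, roles, failed_master="m1", slave="s1")
print(pong)
print(slot_owner)
```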
Master-slave replication
Simple steps of master-slave replication
- The slave node maintains two fields internally, masterhost and masterport, which store the IP and port of the master node.
- The slave has an internal scheduled task that checks every 1s whether there is a new master to connect to and replicate from. If there is, it establishes a socket connection to the master.
- Password authentication: if the master has requirepass set, the slave must send the masterauth password to authenticate.
- The first time, the master performs a full replication and sends all of its data to the slave. (A full replication is also performed when the run ID differs.)
- After that, the master keeps replicating write commands to the slave asynchronously.
The full replication process
- After the master node receives the full replication command, it executes bgsave to generate an RDB file in the background and uses a buffer (called the replication buffer) to record all write commands executed from that moment on.
- When bgsave finishes, the master sends the RDB file to the slave node. The slave first clears its old data and then loads the received RDB file, which brings its database to the state the master was in when it executed bgsave.
- The master sends all the write commands in the replication buffer to the slave. The slave executes these write commands, which brings its database to the master's latest state.
- If AOF is enabled on the slave node, it triggers bgrewriteaof to make sure the AOF file is brought up to date with the master's latest state.
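A toy model can make the snapshot-then-replay order concrete. In this sketch the RDB file is just a dictionary copy and the replication buffer is a list; `Master`, `Slave`, and their methods are hypothetical names, and the AOF rewrite step is omitted.

```python
class Master:
    def __init__(self):
        self.db = {}
        self.replication_buffer = None  # only active during a full sync

    def write(self, key, value):
        self.db[key] = value
        if self.replication_buffer is not None:
            self.replication_buffer.append((key, value))

    def start_full_sync(self):
        self.replication_buffer = []  # record every write from now on
        return dict(self.db)          # stand-in for the RDB file from bgsave

    def drain_buffer(self):
        buffered, self.replication_buffer = self.replication_buffer, None
        return buffered


class Slave:
    def __init__(self):
        self.db = {"stale": "old value"}

    def load_snapshot(self, snapshot):
        self.db.clear()  # drop old data first
        self.db.update(snapshot)

    def apply(self, commands):
        for key, value in commands:
            self.db[key] = value


master, slave = Master(), Slave()
master.write("a", 1)
snapshot = master.start_full_sync()
master.write("b", 2)  # arrives while the snapshot is being transferred
slave.load_snapshot(snapshot)
slave.apply(master.drain_buffer())
print(slave.db)
```

The write to `b` is not in the snapshot, so the slave only sees it because the buffered commands are replayed afterwards.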
Server run ID (runid)
Every Redis node (master or slave) automatically generates a random ID at startup (different on every start), made up of 40 random hexadecimal characters. The runid uniquely identifies a Redis node. You can view it with the info server command.
When master and slave replicate for the first time, the master sends its runid to the slave, and the slave saves it. When the slave reconnects after a disconnection, it sends this saved runid to the master, and the master uses it to decide whether a full replication is needed:
If the runid saved by the slave differs from the master's current runid, the Redis node the slave was synchronizing with before the disconnection is not the current master, so a full replication is performed.
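The runid check can be sketched in a few lines of Python. Both function names are made up for this example, and the sketch covers only the rule stated above: a mismatch forces a full replication. What happens when the IDs match is not described here, so the function makes no claim about it beyond returning False.

```python
import secrets


def new_run_id():
    """40 random hexadecimal characters, regenerated on every start."""
    return secrets.token_hex(20)  # 20 random bytes -> 40 hex characters


def must_full_resync(saved_run_id, master_run_id):
    """A mismatch means the slave last synced with a different master."""
    return saved_run_id != master_run_id


old_master_id = new_run_id()
restarted_master_id = new_run_id()
print(len(old_master_id))
print(must_full_resync(old_master_id, restarted_master_id))
```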
An interview question from Tencent
Can you explain how Redis cluster mode works? In cluster mode, how is a key addressed? What algorithms are there for addressing? Do you understand consistent hashing?
What algorithms are there for addressing?
Hash algorithm
Compute the hash of the key and take it modulo the number of nodes: hash(key) % number of nodes.
Disadvantage: when a node goes down or a new node is added, the number of nodes changes, and the placement of all the data has to be recalculated.
Redis Cluster's hash slot algorithm
This was covered above.
Consistent hashing algorithm
Consistent hashing is implemented with a data structure called a consistent hash ring. The integers on the ring range over (0, 1, 2, 3, …, 2^32-1).
Suppose we have 4 objects, o1, o2, o3, and o4. Use a hash function to compute the hash value of each of the 4 objects (in the range 0 ~ 2^32-1):
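A quick experiment shows how severe the remapping is. The sketch below assumes an MD5-based hash purely for determinism, and `stable_hash` and `node_for` are hypothetical helpers. For a uniform hash, a key keeps its node when going from 3 to 4 nodes only if h % 3 == h % 4, which holds for 3 of every 12 consecutive values, so roughly three quarters of the keys are expected to move.

```python
import hashlib


def stable_hash(key):
    # MD5 is used only to get a deterministic, well-spread integer.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def node_for(key, node_count):
    return stable_hash(key) % node_count


keys = [f"key:{i}" for i in range(10000)]
moved = sum(1 for k in keys if node_for(k, 3) != node_for(k, 4))
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.0%} of keys change node when going from 3 to 4 nodes")
```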
hash(o1) = m1
hash(o2) = m2
hash(o3) = m3
hash(o4) = m4
Place m1, m2, m3, and m4 on the hash ring.
Suppose we have three machines, c1, c2, and c3. Hash the IP address of each:
hash(IP of c1) = t1
hash(IP of c2) = t2
hash(IP of c3) = t3
Place t1, t2, and t3 on the hash ring.
Starting from an object's position on the hash ring, move clockwise to the nearest machine. That is the machine the object belongs to. In this example:
- Object o1 [m1] lands on machine t3 [c3]
- Object o2 [m2] lands on machine t1 [c1]
- Object o3 [m3] lands on machine t2 [c2]
- Object o4 [m4] lands on machine t2 [c2]
Adding a machine
Suppose we add machine c4 and it hashes to position t4 on the ring. Now only o4 needs to be reassigned to c4. The other objects stay on their original machines.
Machine failure
Suppose c1 goes down. Then o2 needs to be reassigned to c3, and the other objects stay on their original machines.
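The ring lookup and the effect of adding or removing a machine can be sketched with a sorted list and binary search. The `HashRing` class, the MD5-based `ring_hash`, and the IP addresses are assumptions for illustration. Any hash that spreads values over 0 ~ 2^32-1 would do.

```python
import bisect
import hashlib


def ring_hash(value):
    # First 4 bytes of MD5 give a position in 0 .. 2^32-1.
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


class HashRing:
    def __init__(self, machines=()):
        self.positions = []  # sorted ring positions of the machines
        self.owner = {}      # ring position -> machine
        for machine in machines:
            self.add(machine)

    def add(self, machine):
        pos = ring_hash(machine)
        bisect.insort(self.positions, pos)
        self.owner[pos] = machine

    def remove(self, machine):
        pos = ring_hash(machine)
        self.positions.remove(pos)
        del self.owner[pos]

    def locate(self, key):
        # First machine at or after the key's position, moving clockwise.
        idx = bisect.bisect_left(self.positions, ring_hash(key))
        if idx == len(self.positions):
            idx = 0  # wrap around the end of the ring
        return self.owner[self.positions[idx]]


ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
keys = [f"o{i}" for i in range(1000)]
before = {k: ring.locate(k) for k in keys}

ring.add("10.0.0.4")
after_add = {k: ring.locate(k) for k in keys}
moved = [k for k in keys if before[k] != after_add[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a machine")

ring.remove("10.0.0.1")
after_remove = {k: ring.locate(k) for k in keys}
```

Every key that moves after the addition moves to the new machine, and after the removal only the keys that lived on the removed machine are reassigned.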
Data skew on the hash ring
When there are too few service nodes, consistent hashing easily leads to data skew because the nodes divide the ring unevenly (most cached objects end up on one server). For example, suppose the system has only two servers, Node A and Node B, positioned unevenly on the ring.
In that case a large amount of data lands on Node A and only a very small amount on Node B. To solve this data skew problem, consistent hashing introduces the virtual node mechanism.
Virtual nodes
Each real machine is mapped to multiple virtual nodes, so that the hash ring appears to hold many machine nodes. In practice this can be done by appending a number to the server's IP or host name.
In the case above, for example, three virtual nodes can be computed for each server by hashing “Node A#1”, “Node A#2”, “Node A#3”, “Node B#1”, “Node B#2”, and “Node B#3”, which yields six virtual nodes.
The data location algorithm stays the same, with one extra step that maps a virtual node to its real node. For example, data located on the virtual nodes “Node A#1”, “Node A#2”, and “Node A#3” all lands on Node A. This solves the data skew problem when there are few service nodes. In practice, the number of virtual nodes is usually set to 32 or more, so even a few service nodes can achieve a relatively even data distribution.
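The effect of virtual nodes can be observed with a short experiment. This sketch follows the "name#number" naming scheme described above, but the MD5-based hash, the helper names `build_ring` and `locate`, and the choice of 100 virtual nodes per server are assumptions for illustration.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(value):
    # First 4 bytes of MD5 give a position in 0 .. 2^32-1.
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


def build_ring(nodes, vnodes):
    """Place `vnodes` virtual nodes per real node, named like "Node A#1"."""
    entries = sorted(
        (ring_hash(f"{node}#{i}"), node)
        for node in nodes
        for i in range(1, vnodes + 1)
    )
    positions = [pos for pos, _ in entries]
    owners = [node for _, node in entries]  # virtual node -> real node
    return positions, owners


def locate(ring, key):
    positions, owners = ring
    # Clockwise search; the modulo wraps past the end of the ring.
    idx = bisect.bisect_left(positions, ring_hash(key)) % len(positions)
    return owners[idx]


keys = [f"key:{i}" for i in range(10000)]
for vnodes in (1, 100):
    ring = build_ring(["Node A", "Node B"], vnodes)
    counts = Counter(locate(ring, k) for k in keys)
    print(vnodes, dict(counts))
```

With a single position per server the split depends entirely on where the two hashes happen to land, while with many virtual nodes the two counts should come out much closer to each other.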