Spark FAQs
2022-07-22 20:50:00 【roo_ one】
In what ways is an RDD's elasticity (resilience) reflected?
Reference answer 1:
An RDD's resilience is reflected in computation: when data is lost or a failure occurs at some stage of a Spark job, the lost partitions can be rebuilt through the RDD's lineage (dependency chain).
1. Memory elasticity: automatic switching between memory and disk storage
2. Fault-tolerance elasticity: lost data can be recovered automatically
3. Computation elasticity: a retry mechanism for failed computations
4. Partitioning elasticity: data can be repartitioned (regrouped) as needed (see the sketch after this list)
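To make aspects 1 and 4 above concrete, here is a minimal Scala sketch; the local master, the input path "input.txt", and the partition counts are placeholder values chosen for illustration, not part of the original answer. MEMORY_AND_DISK persistence lets partitions spill to disk when memory is tight, and repartition/coalesce change the number of partitions on demand.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object ElasticityDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("rdd-elasticity").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    // "input.txt" is a placeholder input path.
    val lines = sc.textFile("input.txt")

    // Memory elasticity: MEMORY_AND_DISK spills partitions that do not fit in
    // memory to disk instead of failing or dropping them.
    val words = lines.flatMap(_.split("\\s+")).persist(StorageLevel.MEMORY_AND_DISK)

    // Partitioning elasticity: repartition() reshuffles data into a new number of
    // partitions; coalesce() shrinks the partition count without a full shuffle.
    val wide   = words.repartition(8)
    val narrow = wide.coalesce(2)

    println(narrow.count())
    spark.stop()
  }
}
```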
Reference answer 2:
1. Automatic switching between memory and disk storage
2. Efficient fault tolerance based on lineage
3. A failed task is retried a certain number of times
4. A failed stage is automatically retried a certain number of times, and only the failed partitions are recomputed
5. checkpoint (each RDD operation produces a new RDD; when the chain grows long and recomputation becomes expensive, the data can simply be written to reliable storage on disk) and persist (reuse data held in memory or on disk); a sketch contrasting the two follows this list
6. Data-scheduling elasticity: DAG and task scheduling are decoupled from resource management
7. Highly elastic data partitioning (repartition)
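As a rough illustration of point 5, the following Scala sketch contrasts persist/cache, which keeps data around for reuse while retaining the lineage, with checkpoint, which writes the data to reliable storage and truncates the lineage. The checkpoint directory and the data are placeholder values; in production the checkpoint directory is usually on HDFS.

```scala
import org.apache.spark.sql.SparkSession

object CheckpointVsPersist {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("checkpoint-demo").master("local[*]").getOrCreate()
    val sc = spark.sparkContext
    // Placeholder checkpoint directory; typically an HDFS path in a real cluster.
    sc.setCheckpointDir("/tmp/spark-checkpoint")

    val base = sc.parallelize(1 to 1000000)
    val long = base.map(_ * 2).filter(_ % 3 == 0).map(_ + 1)

    // cache()/persist(): keeps the data in memory (or on disk) for reuse, but the
    // lineage is retained, so a lost partition is recomputed from its parents.
    long.cache()

    // checkpoint(): writes the data to reliable storage and truncates the lineage,
    // so recovery reads the checkpointed files instead of replaying a long chain.
    // Caching first avoids recomputing the RDD when the checkpoint is materialized.
    long.checkpoint()
    long.count()  // an action triggers both caching and checkpointing

    println(long.isCheckpointed)
    spark.stop()
  }
}
```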
Reference: In what ways is an RDD's elasticity reflected
The Spark master/driver keeps the transformations that were applied to an RDD. As a result, if part of an RDD is lost (for example, because a slave node goes down), it can be quickly and easily recomputed on the surviving hosts in the cluster. This, too, is part of an RDD's resilience.
An RDD carries dependency information, so its computation can be traced back along the lineage. The transformations build up a DAG, and this DAG is split into stages; there are dependencies between the stages, and later stages are built on top of earlier ones. If data produced earlier is lost, the recorded dependencies make it possible to recompute it from upstream. Each operator produces a new RDD. In Spark, the DAG is simply the graph of transformation relationships between RDDs; these relationships become dependencies, which are then divided into different stages, thereby describing the order in which tasks are executed.
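A small sketch of how lineage shows up in practice (the data and partition counts are arbitrary illustration values): each transformation returns a new RDD that remembers its parents, and toDebugString prints that dependency chain, with the shuffle introduced by reduceByKey marking a stage boundary.

```scala
import org.apache.spark.sql.SparkSession

object LineageDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("lineage-demo").master("local[*]").getOrCreate()
    val sc = spark.sparkContext

    // Each transformation produces a new RDD that records its parent(s).
    val nums    = sc.parallelize(1 to 100, 4)
    val doubled = nums.map(_ * 2)             // narrow dependency
    val pairs   = doubled.map(n => (n % 10, n))
    val sums    = pairs.reduceByKey(_ + _)    // wide dependency: shuffle, new stage

    // toDebugString prints the lineage; the DAG scheduler splits the job into
    // stages at the shuffle introduced by reduceByKey.
    println(sums.toDebugString)

    sums.collect()
    spark.stop()
  }
}
```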
The most complete Spark interview questions ever
Reference: The most complete Spark interview questions ever
20 common Spark interview questions (most answers included)
Spark frequently asked questions, in Q&A format
Spark job execution process