Converting an RDD to a DataFrame Using Reflection
2022-07-20 05:31:00 【菜鸟也有梦想啊】
Java:
package cn.spark.sql;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

import java.util.List;

public class RDD2DataFrameReflection {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("RDD2DataFrameReflection")
                .setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Each input line is parsed as three comma-separated fields: ID, name, age.
        JavaRDD<String> lines = sc.textFile("C://Users//Desktop//students.txt");
        JavaRDD<Student> students = lines.map(new Function<String, Student>() {
            private static final long serialVersionUID = 1L;

            @Override
            public Student call(String s) throws Exception {
                String[] lineSplit = s.split(",");
                Student stu = new Student();
                stu.setID(Integer.valueOf(lineSplit[0].trim()));
                stu.setName(lineSplit[1]);
                stu.setAge(Integer.valueOf(lineSplit[2].trim()));
                return stu;
            }
        });

        /*
         * Convert the RDD to a DataFrame using reflection.
         * Passing Student.class tells Spark to build the DataFrame by
         * reflecting on the Student class to discover its fields.
         * The JavaBean must implement the Serializable interface.
         */
        DataFrame studentDF = sqlContext.createDataFrame(students, Student.class);

        // Once we have a DataFrame, we can register it as a temporary table.
        studentDF.registerTempTable("students");
        DataFrame teenagerDF = sqlContext.sql("select * from students where age <= 18");

        // Convert the query result back into an RDD.
        JavaRDD<Row> teenagerRDD = teenagerDF.javaRDD();

        // Map each Row in the RDD back to a Student.
        // Note: the column order of a bean-derived schema may differ from the
        // field declaration order. The indexes below are the ones used in the
        // original post; verify them with teenagerDF.printSchema().
        JavaRDD<Student> teenagerStudentRDD = teenagerRDD.map(new Function<Row, Student>() {
            @Override
            public Student call(Row row) throws Exception {
                Student stu = new Student();
                stu.setAge(row.getInt(0));
                stu.setID(row.getInt(1));
                stu.setName(row.getString(2));
                return stu;
            }
        });

        List<Student> studentList = teenagerStudentRDD.collect();
        for (Student stu : studentList) {
            System.out.println(stu);
        }
    }
}
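The listing above depends on a Student JavaBean that the post does not show. The following is a minimal sketch of what such a class might look like. The field types (int ID, String name, int age) are inferred from the setters called above, and the toString method is an assumption added only so that the final println produces readable output.

```java
import java.io.Serializable;

// Hypothetical JavaBean matching the setters used in the listing above.
// It is public, has a no-argument constructor and getter/setter pairs so
// that it can be inspected by reflection, and it implements Serializable
// as the comment in the listing requires.
public class Student implements Serializable {
    private static final long serialVersionUID = 1L;

    private int ID;
    private String name;
    private int age;

    public int getID() {
        return ID;
    }

    public void setID(int ID) {
        this.ID = ID;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    // Assumed helper: not required by Spark, only used for printing.
    @Override
    public String toString() {
        return "Student{ID=" + ID + ", name='" + name + "', age=" + age + "}";
    }
}
```

Without a toString override, the loop at the end of main would print the default Object representation rather than the field values.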
Test:
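The original test output is not preserved here. As a Spark-free sanity check, the sketch below applies the same split-and-trim parsing and the same age <= 18 filter to a few sample lines. The class name ParseCheck and the sample lines are hypothetical and are not taken from the author's students.txt.

```java
import java.util.ArrayList;
import java.util.List;

public class ParseCheck {
    // Hypothetical sample lines in the "ID,name,age" layout the job parses.
    static final String[] SAMPLE_LINES = {"1,leo,17", "2,marry,19", "3,jack,18"};

    /** Returns the names of students aged 18 or under, mirroring the SQL filter. */
    static List<String> teenagerNames(String[] lines) {
        List<String> names = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            // Same parsing as the map function: trim before converting to int.
            int age = Integer.parseInt(parts[2].trim());
            if (age <= 18) {
                names.add(parts[1]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints [leo, jack] for the sample lines above.
        System.out.println(teenagerNames(SAMPLE_LINES));
    }
}
```

With these sample lines, the Spark job would be expected to return the same two students, since the filter keeps ages 17 and 18 and drops 19.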