Spark Application [Updating the Old df from the New df]

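    // Assumed to be defined earlier in the job:
    //   df1, df2   - the two DataFrames; df2 must contain every column listed in columnName
    //   columnName - all column names of df1
    //   keysOpt    - the primary-key column names

    // Primary-key columns keep their names: map each "key AS keyS" expression back to the bare key.
    var columnMap: Map[String, String] = Map()
    for (key <- keysOpt) {
      columnMap += (key + " AS " + key + "S" -> key)
    }
    // Non-key columns only: map each suffixed name back to its original name.
    var columnBackMap: Map[String, String] = Map()
    for (key <- columnName) {
      if (!keysOpt.contains(key)) {
        columnBackMap += (key + "S" -> key)
      }
    }
    // One "name AS nameS" expression per column of df1.
    val convertion = columnName.map(key => key + " AS " + key + "S")
    // Rename every non-key column of df1 by appending S; key columns are left unchanged.
    val df1_plus = df1.selectExpr(convertion.map(t => columnMap.getOrElse(t, t)): _*)
    // Join the two DataFrames on the primary key (inner join, so only matching keys survive).
    val df3 = df1_plus.join(df2, keysOpt)

    // The columns of df3 that come from df2.
    val df4 = df3.select(columnName.map(c => df2(c)): _*)
    // The columns of df3 that come from df1 (keys plus the suffixed columns).
    val df1_column_back = df1_plus.columns
    val df5 = df3.select(df1_column_back.map(c => df1_plus(c)): _*)
    // Rename the suffixed columns of df5 back to their original names (strip the trailing S).
    val df5_plus = df5.selectExpr(
      df1_column_back.map(t => columnBackMap.get(t).map(orig => t + " AS " + orig).getOrElse(t)): _*)
    // Merge the two sides; union matches columns by position, and both follow the order of columnName.
    val union_Data = df4.union(df5_plus)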
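
The two maps in the snippet above only decide, column by column, whether `selectExpr` receives a bare key name or a rename expression. A minimal sketch of that same logic as pure functions follows; it needs no Spark to run. The names `RenameExprs`, `suffixExprs`, `restoreExprs` and the `suffix` parameter are hypothetical and do not appear in the original post.

```scala
object RenameExprs {
  // Build selectExpr expressions that append `suffix` to every non-key column.
  // Key columns are passed through unchanged.
  def suffixExprs(columns: Seq[String], keys: Seq[String], suffix: String = "S"): Seq[String] =
    columns.map(c => if (keys.contains(c)) c else s"$c AS $c$suffix")

  // Inverse of suffixExprs: given the ORIGINAL column names, build expressions
  // that rename the suffixed columns back to those original names.
  def restoreExprs(columns: Seq[String], keys: Seq[String], suffix: String = "S"): Seq[String] =
    columns.map(c => if (keys.contains(c)) c else s"$c$suffix AS $c")
}
```

Under these assumptions, `df1.selectExpr(RenameExprs.suffixExprs(columnName.toSeq, keysOpt.toSeq): _*)` would play the role of `df1_plus`, and `restoreExprs` with the same arguments would undo the renaming on a frame that carries the suffixed columns. Column names containing spaces or other special characters would need backtick quoting, which this sketch does not handle.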

Result:

  Appending S to the names of the non-primary-key fields

  Removing the S appended to the names of the non-primary-key fields
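
The post does not say which of the two DataFrames is the old one, nor what should happen to rows whose key exists on only one side. If the goal is upsert semantics (rows in the new DataFrame replace old rows with the same key, everything else is kept), one alternative that avoids the renaming round-trip is a `left_anti` join followed by `unionByName`. This is a swapped-in technique, not the author's method; `UpsertSketch`, `upsert`, and the sample data are hypothetical, and `unionByName` assumes Spark 2.3 or later.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UpsertSketch {
  // Assumption: oldDf and newDf have the same column names and compatible types.
  def upsert(oldDf: DataFrame, newDf: DataFrame, keys: Seq[String]): DataFrame = {
    // left_anti keeps only the old rows whose key has no match in newDf.
    val untouched = oldDf.join(newDf, keys, "left_anti")
    // unionByName aligns columns by name instead of by position.
    untouched.unionByName(newDf)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("upsert-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: key 2 exists on both sides, key 3 only in newDf.
    val oldDf = Seq((1, "a", 10), (2, "b", 20)).toDF("id", "name", "score")
    val newDf = Seq((2, "B", 25), (3, "c", 30)).toDF("id", "name", "score")

    upsert(oldDf, newDf, Seq("id")).orderBy("id").show()
    spark.stop()
  }
}
```

With the sample data, the row for key 1 is kept from the old DataFrame, the row for key 2 is taken from the new one, and key 3 is added. Because no two columns ever share a name inside a single join output, the S suffix is not needed in this variant.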

Original post: https://www.cnblogs.com/yszd/p/10144675.html