Converting a Spark DataFrame to JSON

Import Spark's implicit conversions

scala> import spark.implicits._
import spark.implicits._

scala> val df = sc.parallelize(Array(("a",1),("b",2),("c",3))).toDF()
df: org.apache.spark.sql.DataFrame = [_1: string, _2: int]

scala> df.show()
+---+---+                                                                       
| _1| _2|
+---+---+
|  a|  1|
|  b|  2|
|  c|  3|
+---+---+

scala> val jsonStr=df.toJSON.collect()
jsonStr: Array[String] = Array({"_1":"a","_2":1}, {"_1":"b","_2":2}, {"_1":"c","_2":3})
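`toJSON` returns a `Dataset[String]`, so `collect()` pulls every row onto the driver. For larger data, Spark can write the JSON out directly instead. A minimal sketch, continuing the same session (the output path `/tmp/df_json` is a placeholder):

```scala
// Each output part file contains one JSON object per line
df.write.mode("overwrite").json("/tmp/df_json")

// Read it back to confirm the round trip
val back = spark.read.json("/tmp/df_json")
back.show()
```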

Using Scala's JSON classes

Convert the DataFrame to an Array

scala> val dfarr = df.collect().map { case org.apache.spark.sql.Row(x: String, y: Int) => (x, y) }
dfarr: Array[(String, Int)] = Array((a,1), (b,2), (c,3))
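Pattern matching on `Row` throws a `MatchError` if the schema ever changes. A sketch of the same extraction using `Row`'s typed getters instead:

```scala
// Positional typed getters: column 0 holds the String, column 1 the Int
val dfarr2 = df.collect().map(row => (row.getString(0), row.getInt(1)))
```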

Convert the Array to JSONObjects

scala> import scala.util.parsing.json.{JSONObject, JSONArray}
import scala.util.parsing.json.{JSONObject, JSONArray}

scala> val jsonData: Array[JSONObject] = dfarr.map { i => JSONObject(Map(i._1 -> i._2)) }
jsonData: Array[scala.util.parsing.json.JSONObject] = Array({"a" : 1}, {"b" : 2}, {"c" : 3})

Convert the JSONObjects to a JSONArray

scala> val jsonArray: JSONArray = new JSONArray(jsonData.toList)
jsonArray: scala.util.parsing.json.JSONArray = [{"a" : 1}, {"b" : 2}, {"c" : 3}]
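The last two steps need no Spark at all. A self-contained sketch using `scala.util.parsing.json` (deprecated since Scala 2.11, but still available in spark-shell); note that each pair becomes a single-key object, so the column names `_1`/`_2` are lost, unlike with `toJSON`:

```scala
import scala.util.parsing.json.{JSONObject, JSONArray}

// The same pair data as the DataFrame above, without Spark
val pairs = List(("a", 1), ("b", 2), ("c", 3))

// One single-key JSONObject per pair, then wrap them all in a JSONArray
val objs = pairs.map { case (k, v) => JSONObject(Map(k -> v)) }
val arr = JSONArray(objs)

println(arr)  // [{"a" : 1}, {"b" : 2}, {"c" : 3}]
```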
Original post: https://www.cnblogs.com/Jaryer/p/13667571.html