Spark countByKey && countByValue

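countByKey and countByValue are both action operators. Each one returns its result as a Map on the driver, so there is no need to call collect separately before printing. countByKey works on a pair RDD and counts the records for each key; countByValue counts how many times each distinct element occurs, and for a pair RDD the element is the whole (key, value) tuple. Because the full result is held in driver memory, both are best used when the number of distinct keys or elements is modest.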

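    // Assumes an existing SparkSession named `spark` (for example, in spark-shell).
    spark.sparkContext.setLogLevel("ERROR")

    val bd = spark.sparkContext.parallelize(List(("hive", 2), ("hive", 1), ("hive", 2), ("hive", 1), ("hive", 3), ("spark", 2), ("spark", 2)))
    // countByKey ignores the values and counts records per key: hive -> 5, spark -> 2
    bd.countByKey().foreach(println(_))
    println("--------------------------------")
    // countByValue counts each distinct element, here the whole (key, value) tuple:
    // (hive,2) -> 2, (hive,1) -> 2, (hive,3) -> 1, (spark,2) -> 2
    bd.countByValue().foreach(println(_))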
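
To make the difference between the two operators easier to see without a cluster, here is a minimal sketch that models them on ordinary Scala collections, using the same data as the example above. The names `CountSemantics`, `countByKeyLocal` and `countByValueLocal` are hypothetical helpers invented for illustration; they are not part of the Spark API and only mimic the shape of the result (a Map from key or element to a Long count), not how Spark computes it across partitions.

```scala
object CountSemantics {
  // Local model of countByKey: ignore the values, count the records for each key.
  // (Hypothetical helper, not a Spark API.)
  def countByKeyLocal[K, V](pairs: Seq[(K, V)]): Map[K, Long] =
    pairs.groupBy(_._1).map { case (k, group) => k -> group.size.toLong }

  // Local model of countByValue: count the occurrences of each whole element.
  // For a sequence of pairs, the element is the entire (key, value) tuple.
  // (Hypothetical helper, not a Spark API.)
  def countByValueLocal[T](elems: Seq[T]): Map[T, Long] =
    elems.groupBy(identity).map { case (e, group) => e -> group.size.toLong }

  // Same data as the RDD in the example above.
  val data: Seq[(String, Int)] = Seq(
    ("hive", 2), ("hive", 1), ("hive", 2), ("hive", 1), ("hive", 3),
    ("spark", 2), ("spark", 2)
  )

  def main(args: Array[String]): Unit = {
    countByKeyLocal(data).foreach(println)
    println("--------------------------------")
    countByValueLocal(data).foreach(println)
  }
}
```

The first call yields two entries (one per key), the second yields four (one per distinct tuple). The order in which the entries are printed is not guaranteed, since a Map has no defined iteration order here.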

Original article: https://www.cnblogs.com/students/p/14262935.html