Implementing WordCount with PySpark

Code:

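from pyspark import SparkConf, SparkContext

# Run locally with 2 worker threads
conf = SparkConf().setAppName("wordcount").setMaster("local[2]")
sc = SparkContext(conf=conf)
sc.setLogLevel("ERROR")  # only show error-level log messages

# Read the input file as an RDD of lines
inputdata = sc.textFile("2.txt")

# Split each line into words, pair each word with 1, then sum the counts per word
output = inputdata.flatMap(lambda x: x.split(" ")).map(lambda x: (x, 1)).reduceByKey(lambda a, b: a + b)

# Bring the (word, count) pairs back to the driver and print them
result = output.collect()
for i in result:
    print(i)

sc.stop()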
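
To make the three steps in the pipeline above easier to follow, here is a minimal plain-Python sketch of the same logic, without Spark. It only illustrates what flatMap, map and reduceByKey compute; it is not how Spark executes them internally. The function name `word_count` and the sample lines are hypothetical and are not taken from the original post or from "2.txt".

```python
from collections import defaultdict


def word_count(lines):
    """Plain-Python imitation of flatMap -> map -> reduceByKey (illustrative only)."""
    # flatMap step: split every line on single spaces and flatten into one list
    words = [w for line in lines for w in line.split(" ")]
    # map step: pair each word with the count 1
    pairs = [(w, 1) for w in words]
    # reduceByKey step: add up the counts that share the same word
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


# Hypothetical sample input, standing in for the lines of a text file
sample_lines = ["hello spark", "hello world"]
counts = word_count(sample_lines)
print(counts)  # {'hello': 2, 'spark': 1, 'world': 1}
```

Note that `split(" ")` yields empty strings when a line contains consecutive spaces, so an empty-string key can show up in the counts. Calling `split()` with no argument splits on any run of whitespace and avoids this, if that behavior is preferred.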

Result:

Original article: https://www.cnblogs.com/wddqy/p/11970271.html