Spark: the difference between reduceByKey and groupByKey

Looking at the source code:

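reduceByKey and groupByKey:

Both are implemented on top of combineByKeyWithClassTag, but they pass different arguments.

reduceByKey calls combineByKeyWithClassTag[V]((v: V) => v, func, func, partitioner). It does not pass mapSideCombine, so the parameter's default applies:
mapSideCombine = true
Values for the same key are therefore merged with func on the map side, before the shuffle, so each map task sends roughly one record per key instead of one record per input value.

groupByKey calls combineByKeyWithClassTag with functions that collect values into a buffer, and explicitly passes
mapSideCombine = false
Every key-value pair is shuffled as-is and grouped only on the reduce side. Map-side combining would not shrink the data here, because grouping has to keep every value.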
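
The effect of the mapSideCombine flag can be illustrated without Spark. The following is a minimal plain-Scala sketch, not Spark's actual shuffle implementation: the object MapSideCombineSketch and its helper functions are hypothetical names made up for this illustration, and each inner Seq stands in for one map-side partition.

```scala
// Plain-Scala sketch (no Spark dependency) of what mapSideCombine changes.
// All names here are hypothetical; Spark's real shuffle is far more involved.
object MapSideCombineSketch {
  // One map-side partition of key-value pairs.
  type Partition = Seq[(String, Int)]

  // mapSideCombine = true: each partition pre-aggregates its values per key,
  // so it emits one record per key it contains.
  def shuffleWithCombine(parts: Seq[Partition], func: (Int, Int) => Int): Seq[(String, Int)] =
    parts.flatMap { part =>
      part.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(func) }.toSeq
    }

  // mapSideCombine = false: every record crosses the shuffle unchanged.
  def shuffleWithoutCombine(parts: Seq[Partition]): Seq[(String, Int)] =
    parts.flatten

  // Reduce side: merge whatever arrived for each key.
  def reduceSide(shuffled: Seq[(String, Int)], func: (Int, Int) => Int): Map[String, Int] =
    shuffled.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2).reduce(func) }

  def main(args: Array[String]): Unit = {
    val parts: Seq[Partition] = Seq(
      Seq("a" -> 1, "a" -> 2, "b" -> 3),
      Seq("a" -> 4, "b" -> 5, "b" -> 6)
    )
    val sum = (x: Int, y: Int) => x + y

    val combined = shuffleWithCombine(parts, sum)
    val raw = shuffleWithoutCombine(parts)

    println(s"records shuffled with map-side combine: ${combined.size}")    // 4
    println(s"records shuffled without map-side combine: ${raw.size}")      // 6
    println(s"same final result: ${reduceSide(combined, sum) == reduceSide(raw, sum)}") // true
  }
}
```

The final sums are the same either way. Only the number of records that cross the shuffle differs, which is why reduceByKey is usually the better choice when the goal is an aggregate per key.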

Original article: https://www.cnblogs.com/hejunhong/p/12906105.html