Hadoop Basic Concepts

1. Combiner

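A combiner runs between map and reduce, on the map side. It works like a reducer, but it is applied locally to each mapper's output and pre-aggregates records before they are sent to the reducers, which cuts the amount of data moved in the shuffle. Hadoop may run it zero, one, or several times, so the job must produce the same result with or without it.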
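A minimal sketch of the idea, in plain Java rather than the Hadoop API: `CombinerSketch` and `combine` are hypothetical names, and the list of entries stands in for one mapper's output in a word-count job. In a real job the combiner is a `Reducer` subclass registered with `job.setCombinerClass(...)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a combiner; no Hadoop classes are used here.
final class CombinerSketch {

    // Hypothetical helper: sums the values per key for ONE mapper's output,
    // the way a word-count combiner would before the shuffle.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> combined = new TreeMap<>();
        for (Map.Entry<String, Integer> record : mapOutput) {
            combined.merge(record.getKey(), record.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // The mapper emits (word, 1) for every word it sees.
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        for (String word : "a b a c a b".split(" ")) {
            mapOutput.add(Map.entry(word, 1));
        }
        System.out.println(mapOutput.size() + " records before combining");
        // Only one record per distinct word is left to send to the reducers.
        System.out.println(combine(mapOutput)); // prints {a=3, b=2, c=1}
    }
}
```

Summing works here because addition is associative and commutative; an operation like averaging cannot be reused as a combiner unchanged.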

http://hadooptutorial.wikispaces.com/Custom+combiner

http://wiki.apache.org/hadoop/HadoopMapReduce

http://blog.optimal.io/3-differences-between-a-mapreduce-combiner-and-reducer/

2. Partitioner

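A partitioner runs between map and reduce. It decides, based on the key, which reducer each map output record is sent to; all records with the same key end up in the same partition and therefore at the same reducer.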
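As a rough illustration, the sketch below applies the same formula as Hadoop's default `HashPartitioner`, but to plain `String` keys. `PartitionerSketch` and `partitionFor` are hypothetical names, and `String.hashCode()` is only a stand-in for the hash of a real Hadoop key type such as `Text`. A custom partitioner in a real job extends `Partitioner` and is registered with `job.setPartitionerClass(...)`.

```java
// Plain-Java model of hash partitioning; no Hadoop classes are used here.
final class PartitionerSketch {

    // Hypothetical helper: maps a key to a reducer index in [0, numReduceTasks).
    // Masking with Integer.MAX_VALUE clears the sign bit, so a negative
    // hashCode can never produce a negative partition number.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4;
        for (String key : new String[] {"apple", "banana", "apple", "cherry"}) {
            // Both "apple" records get the same partition number.
            System.out.println(key + " -> reducer " + partitionFor(key, numReduceTasks));
        }
    }
}
```

Because the partition depends only on the key, a custom partitioner for a composite key usually hashes just the part of the key that should stay together.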

http://hadooptutorial.wikispaces.com/Custom+partitioner

3. Group

To customize how map output keys are grouped into a single reduce call, use a grouping comparator.

4. Sort

http://stackoverflow.com/questions/16184745/what-is-difference-between-sort-comparator-and-group-comparator-in-hadoop

SortComparator decides how map output keys are sorted while GroupComparator decides which map output keys within the Reducer go to the same reduce method call.
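To make the difference concrete, here is a small plain-Java sketch of a secondary sort. Everything in it is hypothetical (`SecondarySortSketch`, `CompositeKey`, `SORT`, `GROUP`, `reduceCalls`); it only models what a reducer does with the two comparators. In a real job they would be `RawComparator` implementations registered with `job.setSortComparatorClass(...)` and `job.setGroupingComparatorClass(...)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of sort vs. grouping comparators; no Hadoop classes are used.
final class SecondarySortSketch {

    // Hypothetical composite map output key: a natural key plus a secondary field.
    static final class CompositeKey {
        final String name;
        final int value;

        CompositeKey(String name, int value) {
            this.name = name;
            this.value = value;
        }

        @Override
        public String toString() {
            return name + ":" + value;
        }
    }

    // Sort comparator: orders keys by name, then by value.
    static final Comparator<CompositeKey> SORT =
            Comparator.comparing((CompositeKey k) -> k.name).thenComparingInt(k -> k.value);

    // Grouping comparator: compares the name only, so keys that differ
    // just in value are treated as equal for grouping.
    static final Comparator<CompositeKey> GROUP =
            Comparator.comparing((CompositeKey k) -> k.name);

    // Sorts the keys, then starts a new "reduce call" whenever the grouping
    // comparator says the current key differs from the previous one.
    static List<List<CompositeKey>> reduceCalls(List<CompositeKey> mapOutput) {
        List<CompositeKey> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT);
        List<List<CompositeKey>> calls = new ArrayList<>();
        CompositeKey previous = null;
        for (CompositeKey key : sorted) {
            if (previous == null || GROUP.compare(previous, key) != 0) {
                calls.add(new ArrayList<>());
            }
            calls.get(calls.size() - 1).add(key);
            previous = key;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<CompositeKey> mapOutput = Arrays.asList(
                new CompositeKey("b", 2), new CompositeKey("a", 3), new CompositeKey("a", 1));
        System.out.println(reduceCalls(mapOutput)); // prints [[a:1, a:3], [b:2]]
    }
}
```

Each inner list corresponds to one reduce call: the keys arrive ordered by the sort comparator, and the grouping comparator decides where one call ends and the next begins.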

5. Whole picture

http://stackoverflow.com/questions/18395998/hadoop-map-reduce-secondary-sorting

Original post: https://www.cnblogs.com/phoenix13suns/p/4462528.html