Some useful mappers

The map() function generates a (possibly empty) list of (K2, V2) pairs for a given (K1, V1)
input pair. The OutputCollector receives the output of the mapping process, and
the Reporter provides the option to record extra information about the mapper as
the task progresses.
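
The contract described above can be sketched in plain Java without any Hadoop dependency. Everything named here is an assumption made for illustration: ListCollector is a hypothetical stand-in for OutputCollector that simply stores what it receives, WordLengthMapper is a made-up mapper, and the Reporter argument is left out entirely. The point is only that one (K1, V1) input may produce zero, one, or many (K2, V2) outputs.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Hadoop's OutputCollector: it only stores emitted pairs.
class ListCollector<K, V> {
    final List<Map.Entry<K, V>> pairs = new ArrayList<>();

    void collect(K key, V value) {
        pairs.add(new SimpleEntry<>(key, value));
    }
}

// Hypothetical mapper: (K1 = line offset, V1 = line text) -> (K2 = word, V2 = word length).
class WordLengthMapper {
    void map(Long offset, String line, ListCollector<String, Integer> output) {
        for (String word : line.trim().split("\\s+")) {
            // A blank line yields a single empty string, so nothing is emitted for it.
            if (!word.isEmpty()) {
                output.collect(word, word.length());
            }
        }
    }
}

class MapContractDemo {
    public static void main(String[] args) {
        ListCollector<String, Integer> output = new ListCollector<>();
        WordLengthMapper mapper = new WordLengthMapper();
        mapper.map(0L, "hello big world", output);  // emits three pairs
        mapper.map(16L, "   ", output);             // emits nothing
        System.out.println(output.pairs);
    }
}
```

In a real job the framework, not the caller, supplies the collector and invokes map() once per input record.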
Hadoop provides a few useful mapper implementations. Table 3.2 lists some of
them.
Table 3.2 Some useful Mapper implementations predefined by Hadoop

Class                 Description
IdentityMapper<K,V>   Implements Mapper<K,V,K,V> and maps inputs directly to outputs
InverseMapper<K,V>    Implements Mapper<K,V,V,K> and reverses the key/value pair
RegexMapper<K>        Implements Mapper<K,Text,Text,LongWritable> and generates a
                      (match, 1) pair for every regular expression match
TokenCountMapper<K>   Implements Mapper<K,Text,Text,LongWritable> and generates a
                      (token, 1) pair when the input value is tokenized
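
To make the four behaviors in table 3.2 concrete, here is a rough plain-Java imitation of what each mapper emits for a single input pair. These are not the Hadoop classes: MapperSketches and its method names are hypothetical, String and Long stand in for Text and LongWritable, the regex sketch assumes the whole match is emitted, and the token sketch assumes tokens are split on whitespace.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helpers that mimic the output of the predefined mappers for one input pair.
class MapperSketches {
    // Like IdentityMapper: (key, value) -> (key, value)
    static <K, V> List<Map.Entry<K, V>> identity(K key, V value) {
        List<Map.Entry<K, V>> out = new ArrayList<>();
        out.add(new SimpleEntry<>(key, value));
        return out;
    }

    // Like InverseMapper: (key, value) -> (value, key)
    static <K, V> List<Map.Entry<V, K>> inverse(K key, V value) {
        List<Map.Entry<V, K>> out = new ArrayList<>();
        out.add(new SimpleEntry<>(value, key));
        return out;
    }

    // Like RegexMapper: one (match, 1) pair per match; assumes the whole match is the key.
    static List<Map.Entry<String, Long>> regex(String value, Pattern pattern) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        Matcher matcher = pattern.matcher(value);
        while (matcher.find()) {
            out.add(new SimpleEntry<>(matcher.group(), 1L));
        }
        return out;
    }

    // Like TokenCountMapper: one (token, 1) pair per token; assumes whitespace delimiters.
    static List<Map.Entry<String, Long>> tokenCount(String value) {
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(value);
        while (tokens.hasMoreTokens()) {
            out.add(new SimpleEntry<>(tokens.nextToken(), 1L));
        }
        return out;
    }
}

class MapperSketchesDemo {
    public static void main(String[] args) {
        System.out.println(MapperSketches.identity("k", "v"));
        System.out.println(MapperSketches.inverse("k", "v"));
        System.out.println(MapperSketches.regex("a1 b22 c", Pattern.compile("[0-9]+")));
        System.out.println(MapperSketches.tokenCount("to be or not to be"));
    }
}
```

Note that the regex and token sketches ignore the input key, which matches the table: only the value is inspected, and repeated tokens produce repeated (token, 1) pairs that a later reduce step would sum.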
As the MapReduce name implies, the major data flow operation after map is the reduce
phase, shown in the bottom part of figure 3.1.
Original article: https://www.cnblogs.com/chenli0513/p/2290864.html