Hadoop MapReduce: the Mapper and Reducer classes

Hadoop: The Definitive Guide includes a MapReduce example that processes temperature records. Its classes look like this:

Part 1: Map

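public class MaxTemperatureMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // other code

    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // map logic goes here
    }
}

A closer look at the types and arguments:

Mapper is a generic interface with four type parameters. In order, they give the input key, input value, output key, and output value types of the map function; here they are LongWritable, Text, Text, and IntWritable. MapReduceBase, Mapper, OutputCollector, and Reporter all belong to the older org.apache.hadoop.mapred API.

The map method itself takes four arguments:

    LongWritable key: the input key of the map function

    Text value: the input value of the map function

    OutputCollector<Text,IntWritable> output: collects the output pairs, with Text as the output key type and IntWritable as the output value type

    Reporter reporter: reports progress and status back to the framework; it does not carry the output value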
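To make the roles of the map arguments concrete, here is a minimal plain-Java sketch that does not depend on Hadoop. Everything in it is an assumption made for illustration: the class name MapStepSketch, the "year,temperature" line format (the book's real input records use a different layout), and the List of Map.Entry that stands in for OutputCollector. long, String, String, and Integer play the roles of LongWritable, Text, Text, and IntWritable.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java model of the map step; none of these names come from Hadoop.
final class MapStepSketch {

    // key    : position of the line in the input (not used by this sketch)
    // value  : one line of input, assumed to look like "1950,22"
    // output : stand-in for OutputCollector; receives (year, temperature) pairs
    static void map(long key, String value, List<Map.Entry<String, Integer>> output) {
        String[] fields = value.split(",");
        if (fields.length != 2) {
            return; // skip lines that do not match the assumed format
        }
        String year = fields[0].trim();
        int temperature = Integer.parseInt(fields[1].trim());
        output.add(new SimpleEntry<>(year, temperature));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> output = new ArrayList<>();
        map(0L, "1950,22", output);
        map(8L, "1950,-11", output);
        map(17L, "1949,111", output);
        for (Map.Entry<String, Integer> pair : output) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

The sketch has no counterpart to Reporter because it reports no progress; in the real interface that argument is separate from the emitted pairs.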

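PS: Hadoop provides its own set of basic types that are optimized for network serialization, rather than using Java's built-in types directly. They all live in the org.apache.hadoop.io package.

    LongWritable corresponds to Java's Long

    Text corresponds to Java's String

    IntWritable corresponds to Java's Integer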
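The point about compact serialization can be illustrated without Hadoop. The sketch below is only a model under stated assumptions: it writes an int with DataOutputStream.writeInt, a fixed-width binary encoding of the kind a Writable type relies on, and compares the byte count with standard Java serialization of an Integer. The class and method names are hypothetical, and no Hadoop class is used.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Compares two ways of turning an integer into bytes.
final class SerializedSizeSketch {

    // Fixed-width binary encoding: four bytes for any int.
    static int compactSize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Standard Java serialization: stream header and class metadata plus the value.
    static int javaSerializedSize(Integer value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        System.out.println("writeInt:    " + compactSize(163) + " bytes");
        System.out.println("writeObject: " + javaSerializedSize(163) + " bytes");
    }
}
```

When millions of key/value pairs cross the network between map and reduce tasks, the per-record overhead of the second approach adds up, which is the motivation for the dedicated types.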

Part 2: Reduce

Defining and using the Reducer class

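public class MaxTemperatureReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    // old-API signature: values arrive as an Iterator, results go to the OutputCollector
    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // reduce logic goes here
    }
}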

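PS: The Reducer interface also has four type parameters, which specify the input and output types of the reduce function.

The input types of the reduce function must match the output types of the map function, which here are Text and IntWritable.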
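The type-matching rule can be seen in a plain-Java sketch of the reduce step. As before, this is an illustration under assumptions rather than Hadoop code: ReduceStepSketch is a hypothetical name, String and Integer stand in for Text and IntWritable, and the result is returned as a Map.Entry instead of being written to an OutputCollector. The input is a (String, Integer values) group, which is exactly the shape of the pairs a map step with Text and IntWritable outputs would emit.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Map;

// Plain-Java model of the reduce step; none of these names come from Hadoop.
final class ReduceStepSketch {

    // key    : a year emitted by the map step
    // values : every temperature the map step emitted for that year
    // Returns the (year, maximum temperature) pair.
    static Map.Entry<String, Integer> reduce(String key, Iterator<Integer> values) {
        int maxValue = Integer.MIN_VALUE;
        while (values.hasNext()) {
            maxValue = Math.max(maxValue, values.next());
        }
        return new SimpleEntry<>(key, maxValue);
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> result =
                reduce("1950", Arrays.asList(0, 22, -11).iterator());
        System.out.println(result.getKey() + "\t" + result.getValue());
    }
}
```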

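Part 3: Running the MapReduce job

Running the job only requires calling methods on the Job class.

A Job object specifies how the job is to be run. Note that Job is part of the newer org.apache.hadoop.mapreduce API; classes written against the older org.apache.hadoop.mapred interfaces, like the two above, are instead configured through JobConf and submitted with JobClient.runJob.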
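As a rough mental model of what a job specification ties together, the following plain-Java sketch wires a map function and a reduce function into one run: map every input record, group the intermediate pairs by key, then reduce each group. It is an in-memory illustration only. MiniJobSketch, run, and maxTemperatureByYear are hypothetical names, the "year,temperature" input format is assumed, and nothing here is the Hadoop Job API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;
import java.util.function.BiFunction;

// In-memory model of a job run; not the Hadoop Job API.
final class MiniJobSketch {

    // Runs the mapper over every line, groups the emitted pairs by key,
    // then applies the reducer to each group.
    static Map<String, Integer> run(
            List<String> lines,
            BiConsumer<String, BiConsumer<String, Integer>> mapper,
            BiFunction<String, Iterable<Integer>, Integer> reducer) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            mapper.accept(line,
                    (k, v) -> grouped.computeIfAbsent(k, unused -> new ArrayList<>()).add(v));
        }
        Map<String, Integer> results = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            results.put(group.getKey(), reducer.apply(group.getKey(), group.getValue()));
        }
        return results;
    }

    // The "job specification": which map and reduce functions to apply to the input.
    static Map<String, Integer> maxTemperatureByYear(List<String> lines) {
        return run(
                lines,
                (line, emit) -> {
                    String[] fields = line.split(","); // assumed "year,temperature" format
                    emit.accept(fields[0], Integer.parseInt(fields[1]));
                },
                (year, temperatures) -> {
                    int max = Integer.MIN_VALUE;
                    for (int t : temperatures) {
                        max = Math.max(max, t);
                    }
                    return max;
                });
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1950,22", "1950,-11", "1949,111", "1949,78");
        maxTemperatureByYear(lines)
                .forEach((year, max) -> System.out.println(year + "\t" + max));
    }
}
```

A real job adds what this sketch leaves out: input and output paths, the key and value classes, and distribution of the map and reduce tasks across a cluster.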