Hadoop experiment: finding the minimum temperature in weather data


1. Download part of the data. Because this is only an experiment, only part of the 2003 weather data was downloaded.

2. Decompress the files and redirect the output into a single file with the command zcat *gz > sample.txt

[hadoop@Master test_data]$ zcat *gz > /home/hadoop/input/sample.txt

3. Inspect the data format

4. Put the file sample.txt into the HDFS file system

[hadoop@Master input]$ hadoop fs -put /home/hadoop/input/sample.txt  /user/hadoop/in/sample.txt

5. Mapper: MinTemperatureMapper.java


import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MinTemperatureMapper
  extends Mapper<LongWritable, Text, Text, IntWritable>
{

  // Sentinel value that marks a missing temperature reading
  private static final int MISSING = -9999;

  @Override
  public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {

    String line = value.toString();
    // Fixed-width record: the year is in characters 0-3
    String year = line.substring(0, 4);
    // The air temperature is in characters 14-18
    int airTemperature = Integer.parseInt(line.substring(14, 19).trim());

    // Emit (year, temperature) only for valid readings
    if (airTemperature != MISSING) {
      context.write(new Text(year), new IntWritable(airTemperature));
    }
  }
}
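
To make the offsets in the mapper concrete, here is a minimal standalone sketch that applies the same substring calls to two records outside Hadoop. The two sample lines are made up for illustration and are laid out only to fit the offsets the mapper uses. The class name FixedWidthParseDemo and its helper methods are hypothetical and not part of the original job.

```java
// Standalone sketch (no Hadoop needed). The records below are hypothetical:
// they are shaped only to match the substring offsets used by the mapper.
final class FixedWidthParseDemo {

    // Same sentinel as in MinTemperatureMapper
    static final int MISSING = -9999;

    // Characters 0-3 hold the year
    static String parseYear(String line) {
        return line.substring(0, 4);
    }

    // Characters 14-18 hold the right-aligned temperature; trim() drops the padding
    static int parseTemperature(String line) {
        return Integer.parseInt(line.substring(14, 19).trim());
    }

    public static void main(String[] args) {
        String[] lines = {
            "2003 01 01 00   -50  -100",  // made-up valid reading
            "2003 01 01 01 -9999  -100"   // made-up missing reading
        };
        for (String line : lines) {
            int temperature = parseTemperature(line);
            if (temperature != MISSING) {
                // Corresponds to the context.write(...) call in the mapper
                System.out.println(parseYear(line) + "\t" + temperature);
            } else {
                System.out.println("skipped record with missing temperature");
            }
        }
    }
}
```

The second record is skipped because its temperature field holds the sentinel, which is the same filtering the mapper performs before writing to the context.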

6. Reducer: MinTemperatureReducer.java

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MinTemperatureReducer
  extends Reducer<Text, IntWritable, Text, IntWritable>
{

  @Override
  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException
  {
    // Start from the largest int so that any real reading replaces it
    int minValue = Integer.MAX_VALUE;
    for (IntWritable value : values)
    {
      minValue = Math.min(minValue, value.get());
    }
    context.write(key, new IntWritable(minValue));
  }
}
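
The grouping and minimum that the reducer relies on can be imitated locally. The sketch below is a plain Java stand-in, not Hadoop code: it assumes the (year, temperature) pairs have already been emitted and folds them into one minimum per year. The class LocalMinByYear, its method, and the sample values are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Local stand-in for "group by key, then take the minimum" (not Hadoop code).
final class LocalMinByYear {

    // years[i] and temps[i] together form one hypothetical (year, temperature) pair
    static Map<String, Integer> minByYear(String[] years, int[] temps) {
        // TreeMap keeps the years sorted, similar to sorted reducer keys
        Map<String, Integer> result = new TreeMap<>();
        for (int i = 0; i < years.length; i++) {
            // Keep the smaller of the stored value and the new reading
            result.merge(years[i], temps[i], Integer::min);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] years = {"2003", "2003", "2004"};  // made-up keys
        int[] temps = {-50, -183, 12};              // made-up readings
        for (Map.Entry<String, Integer> entry : minByYear(years, temps).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

In the real job the framework does the grouping during the shuffle, and the reducer only sees one key with its iterable of values at a time.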


7. MapReduce job: MinTemperature.java

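import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;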

public class MinTemperature
{
        public static void main(String[] args) throws Exception
        {
                if (args.length!= 2)
                {
                        System.err.println("Usage: MinTemperature <input path> <output path>");
                        System.exit(-1);
                }
                Job job= new Job();
                job.setJarByClass(MinTemperature.class);
                job.setJobName("Min temperature");
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                job.setMapperClass(MinTemperatureMapper.class);
                job.setReducerClass(MinTemperatureReducer.class);
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);
                System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
}


8. Compile and package into a jar


[hadoop@Master myclass]$ javac -classpath /usr/hadoop/hadoop-core-1.2.1.jar  MinTemperature*.java


[hadoop@Master myclass]$ jar cvf MinTemperature.jar MinTemperature*.class
added manifest
adding: MinTemperature.class(in = 1417) (out= 799)(deflated 43%)
adding: MinTemperatureMapper.class(in = 1740) (out= 722)(deflated 58%)
adding: MinTemperatureReducer.class(in = 1664) (out= 707)(deflated 57%)


9. Run the job

[hadoop@Master myclass]$ hadoop jar /usr/hadoop/myclass/MinTemperature.jar MinTemperature  /user/hadoop/in/sample.txt  ./out2


The job failed at run time. The error message was as follows:



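After a long search for the cause, it turned out that the class files had not been deleted, so the program could not find the classes. Delete the class files in the myclass directory and keep only the generated jar.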

[hadoop@Master myclass]$ rm MinTemperature*.class


10. View the results










Original post: https://www.cnblogs.com/yjbjingcha/p/7147395.html