Step 0: Installation and Startup

I. Setting up a Single Node Cluster:

  http://hadoop.apache.org/docs/r2.6.5/hadoop-project-dist/hadoop-common/SingleCluster.html

  1. Purpose: configure a single-node cluster so that you can get hands-on practice with MapReduce (MR) and HDFS.

  2. Required software:

    JDK

    SSH
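
A quick way to confirm both prerequisites are present is to look for their commands on PATH. The helper below is only a sketch; the function name check_prereqs is made up for this note, and it assumes the JDK provides a `java` command and the SSH package provides `ssh`.

```shell
# Report whether each named command can be found on PATH.
# Prints one "ok:" or "missing:" line per command and returns
# non-zero if at least one command is missing.
check_prereqs() {
    rc=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "ok: $cmd"
        else
            echo "missing: $cmd"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed command names: java (from the JDK) and ssh (from the SSH package).
check_prereqs java ssh || echo "Install the missing software before continuing."
```

If anything is reported missing, install it with your distribution's package manager before moving on.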

  3. Preparation before starting:

    Download the binary distribution, unpack it, and change into its top-level directory.
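
The unpacking step might look like the sketch below. It assumes the tarball is named after the release (for example hadoop-2.6.5.tar.gz, matching the version used later in these notes) and contains a single top-level folder of the same name. The function name unpack_release and the example paths are hypothetical.

```shell
# Unpack a release tarball into a destination directory and print the
# path of the resulting top-level directory.
# Assumption: NAME.tar.gz contains a single root folder called NAME.
unpack_release() {
    tarball=$1
    dest=$2
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest" || return 1
    echo "$dest/$(basename "$tarball" .tar.gz)"
}

# Example with hypothetical paths:
# cd "$(unpack_release "$HOME/Downloads/hadoop-2.6.5.tar.gz" "$HOME/opt")"
```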

    Only two settings in etc/hadoop/hadoop-env.sh need to be changed:

 # set to the root of your Java installation

 export JAVA_HOME=/usr/java/latest

 # Assuming your installation directory is /usr/local/hadoop

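 # (This is where Hadoop is actually installed. It matters: on my own Ubuntu machine a wrong value here made the hadoop command fail with class-not-found errors.)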
 export HADOOP_PREFIX=/usr/local/hadoop
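
Because a wrong path here only shows up later as a confusing class-not-found error, it can help to sanity-check both values first. The sketch below simply tests that the expected executable exists under each directory; check_home is a hypothetical helper, and the fallback paths are the example values from the snippet above, not necessarily correct for your machine.

```shell
# Check that a directory-valued setting points at a tree that contains
# the expected executable.
# Usage: check_home NAME DIR RELATIVE_EXECUTABLE
check_home() {
    if [ -x "$2/$3" ]; then
        echo "$1 looks valid: $2"
    else
        echo "$1 looks wrong: $2/$3 not found or not executable"
        return 1
    fi
}

# Fall back to the example values above when the variables are unset.
check_home JAVA_HOME "${JAVA_HOME:-/usr/java/latest}" bin/java || true
check_home HADOOP_PREFIX "${HADOOP_PREFIX:-/usr/local/hadoop}" bin/hadoop || true
```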

    Run $ bin/hadoop with no arguments; it prints the usage documentation for the hadoop script.

  4. The three modes in which a cluster can be started:

    Local (Standalone) Mode
    Pseudo-Distributed Mode
    Fully-Distributed Mode

II. Standalone Operation (local, single node)

  By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.

  Example: count the strings in the configuration files that match a regular expression

$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar grep input output 'dfs[a-z.]+'
$ cat output/*
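
To get a feel for what the grep job computes, roughly the same result can be approximated with ordinary shell tools: extract every match, count identical matches, and list them with the highest count first. This is only a local sketch for comparison, not how Hadoop implements the job. The function name count_matches is made up, it assumes a grep that supports -o and -h (GNU and BSD grep do), and it assumes matches contain no whitespace.

```shell
# Print "count<TAB>match" for every distinct match of an extended regex
# in the given files, highest count first.
# Usage: count_matches PATTERN FILE...
count_matches() {
    pattern=$1
    shift
    grep -ohE "$pattern" "$@" \
        | sort \
        | uniq -c \
        | sort -k1,1nr -k2 \
        | awk '{ print $1 "\t" $2 }'
}

# Run against the same input directory as the job, if it exists.
if [ -d input ]; then
    count_matches 'dfs[a-z.]+' input/*.xml
fi
```

Comparing this listing with the contents of output/ is a simple check that the job ran over the files you expected.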

  Note: the same examples jar also contains a word-count job: bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount input output
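
A MapReduce job refuses to start if its output directory already exists, so running wordcount right after the grep example with the same output path fails until that directory is removed or a different name is chosen. A minimal sketch of a cleanup helper follows; clean_output is a hypothetical name, and the refusal of empty or root-like arguments is just a small safety net.

```shell
# Remove a previous job's output directory, if present, so the next job
# can create it. Does nothing when the directory does not exist.
clean_output() {
    case "$1" in
        ""|"/"|"."|"..")
            echo "refusing to remove '$1'" >&2
            return 1
            ;;
    esac
    if [ -d "$1" ]; then
        rm -r -- "$1"
        echo "removed $1"
    fi
}

# Intended use from the Hadoop directory (jar path as in the note above):
# clean_output output
# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount input output
```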

III. Pseudo-Distributed Mode

Original article: https://www.cnblogs.com/mzzcy/p/8987215.html