Hadoop 2.7.3 local mode installation and wordcount example

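  1. Prepare the virtual machine: linux-rhel-7.4-server. The VM does not need internet access, so choose the host-only network mode. This requires creating a virtual network adapter first, in the Host Network Manager under VirtualBox's management menu. After the VM is installed, its network interface is disabled by default and has to be enabled by editing its configuration file (under /etc/sysconfig/network-scripts):
    [root@hadoop-01 network-scripts]# vi ifcfg-enp0s3 # default NIC configuration
    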
    TYPE=Ethernet
    PROXY_METHOD=none
    BROWSER_ONLY=no
    BOOTPROTO=dhcp
    DEFROUTE=yes
    IPV4_FAILURE_FATAL=no
    IPV6INIT=yes
    IPV6_AUTOCONF=yes
    IPV6_DEFROUTE=yes
    IPV6_FAILURE_FATAL=no
    IPV6_ADDR_GEN_MODE=stable-privacy
    NAME=enp0s3
    UUID=9e448496-ecd5-4122-a91f-91f91bd15f5e
    DEVICE=enp0s3
    ONBOOT=yes # change to yes (the default is no), then reboot the VM
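
The same change can be made without opening an editor. The sketch below is one possible way, assuming the file already contains an ONBOOT= line as in the listing above. The function name enable_onboot is made up for illustration, and the edit goes through a temporary copy instead of relying on a sed in-place flag.

```shell
# enable_onboot FILE: set ONBOOT=yes in an ifcfg-style file.
# Hypothetical helper for illustration; run as root on the real file.
enable_onboot() {
    cfg="$1"
    tmp=$(mktemp) || return 1
    # Rewrite the whole ONBOOT line, whatever its current value is,
    # then copy the result back so the original file keeps its permissions.
    sed 's/^ONBOOT=.*/ONBOOT=yes/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example: `enable_onboot /etc/sysconfig/network-scripts/ifcfg-enp0s3`, followed by the reboot mentioned in the comment on the ONBOOT line.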
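  2. Check the network configuration again. The VM can now communicate with the host's virtual adapter on the same subnet; if the host has network sharing enabled, the VM can also reach the internet.
    [root@hadoop-01 network-scripts]# ifconfig
    enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500 # an IP address has now been assigned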
            inet 192.168.56.101  netmask 255.255.255.0  broadcast 192.168.56.255
            inet6 fe80::bcf9:1d0d:e75d:500f  prefixlen 64  scopeid 0x20<link>
            ether 08:00:27:fb:11:51  txqueuelen 1000  (Ethernet)
            RX packets 5763894  bytes 8204104505 (7.6 GiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 310622  bytes 23522131 (22.4 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            inet6 ::1  prefixlen 128  scopeid 0x10<host>
            loop  txqueuelen 1  (Local Loopback)
            RX packets 1698  bytes 134024 (130.8 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 1698  bytes 134024 (130.8 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
  3. Upload the prepared packages:
    ZBMAC-C03VQ091H:实验介质 hadoop$ ls
    ZooInspector.zip                        mysql-5.7.19-1.el7.x86_64.rpm-bundle.tar
    ZooViewer.zip                           mysql-connector-java-5.1.43-bin.jar
    apache-flume-1.7.0-bin.tar.gz           pig-0.17.0.tar.gz
    apache-hive-2.3.0-bin.tar.gz            sqoop-1.4.5.bin__hadoop-0.23.tar.gz
    hadoop-2.7.3.tar.gz                     virtualbox
    hbase-1.3.1-bin.tar.gz                  winscp513setup.exe
    hue-4.0.1.tgz                           zookeeper-3.4.10.tar.gz
    jdk-8u144-linux-x64.tar.gz
    
    # Using the scp command:
    scp ./* hadoop-01@192.168.56.101:/home/hadoop-01/
  4. Install JDK 1.8 by extracting the archive:
    tar -zxvf jdk-8u144-linux-x64.tar.gz
  5. Set JAVA_HOME for the current user by editing ~/.bash_profile:
    JAVA_HOME=/home/hadoop-02/sdk-home/jdk1.8.0_144
    export JAVA_HOME
    
    PATH=$JAVA_HOME/bin:$PATH
    export PATH
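
Before moving on, it can help to confirm that JAVA_HOME really points at a JDK. A minimal sketch, where check_java_home is a hypothetical helper name, not something shipped with the JDK or Hadoop:

```shell
# check_java_home DIR: succeed only if DIR/bin/java exists and is executable.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "JAVA_HOME looks valid: $1"
    else
        echo "no executable java under: $1/bin" >&2
        return 1
    fi
}
```

After reloading the profile, `check_java_home "$JAVA_HOME"` reports whether the path set above is usable.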
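  6. After setting the Java environment variables, reload the profile (log in again or run source ~/.bash_profile), then run java -version to check that the version is correct.
  7. Extract Hadoop: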
    tar -zxvf hadoop-2.7.3.tar.gz
  8. Hadoop directory layout (listing abridged; ... marks omitted entries):
    [hadoop-02@hadoop-02 ~]$ tree -L 3 /home/hadoop-02/sdk-home/hadoop-2.7.3/
    /home/hadoop-02/sdk-home/hadoop-2.7.3/
    |-- bin # executables
    |   |-- container-executor
    |   |-- hadoop
    |   |-- ...
    |   |-- yarn
    |   `-- yarn.cmd
    |-- etc
    |   `-- hadoop # configuration directory
    |       |-- capacity-scheduler.xml
    |       |-- configuration.xsl
    |       |-- ...
    |       |-- yarn-env.sh
    |       `-- yarn-site.xml
    |-- include
    |   |-- hdfs.h
    |   |-- ...
    |   `-- TemplateFactory.hh
    |-- lib
    |   `-- native
    |       |-- libhadoop.a
    |       |-- libhadooppipes.a
    |       |-- ...
    |       `-- libhdfs.so.0.0.0
    |-- libexec
    |   |-- hadoop-config.cmd
    |   |-- hadoop-config.sh
    |   |-- ...
    |-- LICENSE.txt
    |-- logs
    |   |-- hadoop-hadoop-02-datanode-hadoop-02.log
    |   |-- hadoop-hadoop-02-datanode-hadoop-02.out
    |   |-- ...
    |-- NOTICE.txt
    |-- README.txt
    |-- sbin # start/stop scripts
    |   |-- distribute-exclude.sh
    |   |-- hadoop-daemon.sh
    |   |-- ...
    |   `-- yarn-daemons.sh
    `-- share
        |-- doc # documentation
        |   `-- hadoop
        `-- hadoop  # all jar files
            |-- common
            |-- hdfs
            |-- httpfs
            |-- kms
            |-- mapreduce # contains the examples jar
            |-- tools
            `-- yarn
  9. Set Hadoop's environment variables:
    # /hadoop-2.7.3/etc/hadoop/hadoop-env.sh
    
    # Set JAVA_HOME to the actual JDK directory:
    # The java implementation to use.
    export JAVA_HOME=/home/hadoop-02/sdk-home/jdk1.8.0_144/
    
  10. The local environment is now ready. Go to Hadoop's sbin directory and run start-all.sh:
    [hadoop-02@hadoop-02 sbin]$ ./start-all.sh
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
    Starting namenodes on []
    hadoop-02@localhost's password:
    localhost: starting namenode, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/hadoop-hadoop-02-namenode-hadoop-02.out
    hadoop-02@localhost's password:
    localhost: starting datanode, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/hadoop-hadoop-02-datanode-hadoop-02.out
    Starting secondary namenodes [0.0.0.0]
    hadoop-02@0.0.0.0's password:
    0.0.0.0: starting secondarynamenode, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/hadoop-hadoop-02-secondarynamenode-hadoop-02.out
    0.0.0.0: Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.
    0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:472)
    0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:462)
    0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:455)
    0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:229)
    0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
    0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:671)
    starting yarn daemons
    starting resourcemanager, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/yarn-hadoop-02-resourcemanager-hadoop-02.out
    hadoop-02@localhost's password:
    localhost: starting nodemanager, logging to /home/hadoop-02/sdk-home/hadoop-2.7.3/logs/yarn-hadoop-02-nodemanager-hadoop-02.out
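    [hadoop-02@hadoop-02 sbin]$
    
  11. If passwordless SSH login is not configured, you are prompted for the password four times. Watch the log output to see which service each prompt starts. The "Incorrect configuration" warning and the SecondaryNameNode exception are expected here: fs.defaultFS is still at its default, file:///, so there is no NameNode address for the HDFS daemons to use.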
  12. Check the running services with jps. Only the YARN daemons are listed; the HDFS daemons are not running:
    [hadoop-02@hadoop-02 sbin]$ jps
    6305 Jps
    6178 NodeManager
    5883 ResourceManager
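    [hadoop-02@hadoop-02 sbin]$
    
  13. Run the wordcount example. In local mode the job runs inside a single JVM against the local file system, so the input and output arguments are ordinary local paths (this assumes Hadoop's bin directory is on PATH; otherwise invoke ../bin/hadoop):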
    [hadoop-02@hadoop-02 sbin]$ hadoop jar /home/hadoop-02/sdk-home/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /home/hadoop-02/test_hadoop/wordcount.txt /home/hadoop-02/test_hadoop/wordcount_output/
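
The command above assumes that the input file already exists and that the output directory does not; Hadoop refuses to start a job whose output directory is already present. A minimal sketch of preparing both follows. The helper name prepare_wordcount_input and the sample sentences are made up for illustration.

```shell
# prepare_wordcount_input DIR: create DIR/wordcount.txt with sample text
# and remove any previous DIR/wordcount_output.
prepare_wordcount_input() {
    work_dir="$1"
    mkdir -p "$work_dir" || return 1
    # Hypothetical sample input, two short lines.
    printf '%s\n' 'hello hadoop' 'hello local mode' > "$work_dir/wordcount.txt"
    # The output directory must not exist when the job starts.
    rm -rf "$work_dir/wordcount_output"
}
```

With the paths used in this post, the call would be `prepare_wordcount_input /home/hadoop-02/test_hadoop`.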
    

      

  14. The result files are in the output directory:
    [hadoop-02@hadoop-02 sbin]$ cd /home/hadoop-02/test_hadoop/wordcount_output
    [hadoop-02@hadoop-02 wordcount_output]$ ls
    _SUCCESS  part-r-00000
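
part-r-00000 holds one word and its count per line, separated by a tab. For a small input, the result can be cross-checked without Hadoop. The pipeline below is a rough stand-in, not the example job's implementation: it assumes words are separated by whitespace only, and local_wordcount is a hypothetical name.

```shell
# local_wordcount FILE: print "word<TAB>count" for each whitespace-separated
# word in FILE, sorted by word in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}
```

Its output can then be compared with part-r-00000, for example with diff.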
    

      

  15. This completes the walkthrough of the local environment setup.
Original article: https://www.cnblogs.com/sunlightlee/p/10235921.html