Hadoop distributed installation

First, configure passwordless SSH login.

Seven nodes are prepared:

192.168.101.172       node1          
192.168.101.206       node2
192.168.101.207       node3
192.168.101.215       node4
192.168.101.216       node5
192.168.101.217       node6
192.168.101.218       node7

Modify the hosts file on every machine as follows:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.101.172 node1
192.168.101.206 node2
192.168.101.207 node3
192.168.101.215 node4
192.168.101.216 node5
192.168.101.217 node6
192.168.101.218 node7

Change the hostname:

vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node2

Disable the firewall:

CentOS 6:

service iptables stop

CentOS 7:

systemctl stop firewalld
systemctl disable firewalld

Create the user:

groupadd hadoop
useradd hadoop -g hadoop
mkdir /main/bigdata
chown -R hadoop:hadoop /main/bigdata/
passwd hadoop

Time synchronization (CentOS 6):

yum -y install ntp

ntpdate cn.pool.ntp.org
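
If you want the clocks to stay in sync rather than doing a one-off adjustment, a minimal cron entry like the following can be used (an assumption, not part of the original setup; add it with crontab -e as root):

# Hypothetical root crontab entry: re-sync against the same pool server every 30 minutes
*/30 * * * * /usr/sbin/ntpdate cn.pool.ntp.org >/dev/null 2>&1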

Start configuring passwordless login:

The core idea of passwordless login: if server B's authorized_keys contains server A's public key (the lock), then server A can log in to server B without a password.

First, generate a key pair on each of the 7 machines:


# install the ssh client
yum -y install openssh-clients  

su hadoop
rm -rf ~/.ssh/*
ssh-keygen -t rsa # press Enter through all prompts, then append the public key to the authorized file
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Next, make one machine able to access all the others.

Reference: https://blog.csdn.net/ITYang_/article/details/70144395

First run the following command on the other 6 machines to send each machine's public key to node1 (172)'s authorized_keys:



#centos6
ssh-copy-id "-p 2121 192.168.101.172"
#centos7
ssh-copy-id -p 2121 hadoop@node1


# Or, more explicitly: if the previous step generated a DSA key, you must point -i at id_dsa.pub
ssh-copy-id -i ~/.ssh/id_dsa.pub "-p 2121 192.168.101.172"

This passes all our public keys to node 172. On 172 (node1), run:

cat .ssh/authorized_keys

and you will see that the keys from the other nodes have all arrived.

Test:

Now run the following on any other node:

ssh node1 -p 2121

and you can log in to node1 without a password!

The next step is to distribute the .ssh/authorized_keys file we painstakingly collected on node1 back out to every node.

Run a shell script on node1. First install expect:

yum install expect

Then:

#!/bin/bash
SERVERS="192.168.101.172 192.168.101.206 192.168.101.207 192.168.101.215 192.168.101.216 192.168.101.217 192.168.101.218"
PASSWORD=machine_password   # replace with the actual login password

# Copy this machine's key to the other nodes so the scp step below can run without a password
auto_ssh_copy_id() {
    expect -c "set timeout -1;
        # With the default ssh port you can simply use: spawn ssh-copy-id $1;
        spawn ssh-copy-id \"-p 2121 $1\";
        expect {
            *(yes/no)* {send -- yes\r;exp_continue;}
            *assword:* {send -- $2\r;exp_continue;}
            eof        {exit 0;}
        }";
}

# Loop over all machines and start copying
ssh_copy_id_to_all() {
    for SERVER in $SERVERS
    do
        auto_ssh_copy_id $SERVER $PASSWORD
    done
}

# Call the function above
ssh_copy_id_to_all

# Loop over all machine IPs and scp this machine's authorized_keys to them (port 2121; omit -P with the default port). Change the hadoop username as needed.
for SERVER in $SERVERS
do
    scp -P 2121 ~/.ssh/authorized_keys hadoop@$SERVER:~/.ssh/
done

On CentOS 7 the ssh-copy-id syntax changed, so the script becomes:

#!/bin/bash
SERVERS="node1 node2 node3 node4 node5 node6 node7 node8"
USERNAME=hadoop
PASSWORD=machine_password   # replace with the actual login password

# Copy this machine's key to the other nodes so the scp step below can run without a password
auto_ssh_copy_id() {
    expect -c "set timeout -1;
        # With the default ssh port you can simply use: spawn ssh-copy-id $1;
        spawn ssh-copy-id  -p 2121 $2@$1 ;
        expect {
            *(yes/no)* {send -- yes
;exp_continue;}
            *assword:* {send -- $3
;exp_continue;}
            eof        {exit 0;}
        }";
}

# Loop over all machines and start copying
ssh_copy_id_to_all() {
    for SERVER in $SERVERS
        do
            auto_ssh_copy_id $SERVER  $USERNAME $PASSWORD
        done
}

# Call the function above
ssh_copy_id_to_all


# Loop over all machine IPs and scp this machine's authorized_keys to them (port 2121; omit -P with the default port). Change the hadoop username as needed.
for SERVER in $SERVERS
do
    scp -P 2121 ~/.ssh/authorized_keys $USERNAME@$SERVER:~/.ssh/
done

However, in practice something strange happened when using a non-root user.

A CentOS 6 side note:

After node1 collected the keys from the other 6 nodes and ran ssh-copy-id back to them, passwordless ssh to those nodes still did not work, and scp still asked for a password...

But once a node deleted its local ~/.ssh/authorized_keys, it worked...

So I ran rm ~/.ssh/authorized_keys on the other 6 nodes, ran the script on node1 again, and no password was needed anymore...

The root user does not have this problem...

Very strange. If any reader figures it out, please get in touch so we can learn from each other.
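
One common cause worth ruling out (an assumption, not verified on this cluster): for non-root users, sshd's StrictModes check requires the home directory, ~/.ssh, and authorized_keys to have strict permissions; otherwise the key file is silently ignored and password authentication kicks in. A quick check on each node:

# Run as the hadoop user on every node: tighten permissions so sshd accepts the key file
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys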

On node1, run ssh nodeX -p xxxx against each node and answer yes; this caches the host keys in known_hosts.

Then scp this file to the other machines:

#!/bin/bash
SERVERS="node1 node2 node3 node4 node5 node6 node7 node8"
for SERVER in $SERVERS
do
    scp -P 2121 ~/.ssh/known_hosts hadoop@$SERVER:~/.ssh/
done

This way the other machines do not have to answer yes on the first connection.

At this point, passwordless login between the machines is done.

Install Java 8:

cd /main
tar -zvxf soft/jdk-8u171-linux-x64.tar.gz -C .

vim /etc/profile

# append at the end
export JAVA_HOME=/main/jdk1.8.0_171
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib  
export PATH=:$PATH:$JAVA_HOME/bin:$JRE_HOME/bin


rm -rf /usr/bin/java*
# take effect immediately
source /etc/profile

ZooKeeper installation and configuration are omitted; basically just extract the archive, adjust the paths, change the port if you like, and set myid (a short sketch follows the sample zoo.cfg below).

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/main/zookeeper/data
dataLogDir=/main/zookeeper/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.101.173:2888:3888
server.2=192.168.101.183:2888:3888
server.3=192.168.101.193:2888:3888
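
For the myid step mentioned above, a short sketch (the /main/zookeeper install path is assumed from the dataDir in this zoo.cfg; the number written must match the server.N entry for that host):

# On 192.168.101.173 (server.1); write 2 and 3 on the other two hosts respectively
echo 1 > /main/zookeeper/data/myid
# Then start ZooKeeper on each host
/main/zookeeper/bin/zkServer.sh start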

Install Hadoop:

Install on 5 nodes.

We will install on the five machines node1-node5, with node1/node2 as masters and node3-node5 as slaves.

Configure on node1 first:

cd /main
tar -zvxf soft/hadoop-2.8.4.tar.gz -C .
chown -R hadoop:hadoop /main/hadoop-2.8.4/

su hadoop
mkdir /main/hadoop-2.8.4/data
mkdir /main/hadoop-2.8.4/data/journal
mkdir /main/hadoop-2.8.4/data/tmp
mkdir /main/hadoop-2.8.4/data/hdfs
mkdir /main/hadoop-2.8.4/data/hdfs/namenode
mkdir /main/hadoop-2.8.4/data/hdfs/datanode

Next, edit the following files under /main/hadoop-2.8.4/etc/hadoop/:

hadoop-env.sh

mapred-env.sh

yarn-env.sh

In all three files, set JAVA_HOME:

export JAVA_HOME=/main/jdk1.8.0_171

Also add the ssh port on the last line of hadoop-env.sh:

export HADOOP_SSH_OPTS="-p 2121"

Edit the /main/hadoop-2.8.4/etc/hadoop/slaves file:

node3
node4
node5

Edit /main/hadoop-2.8.4/etc/hadoop/core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- fs.defaultFS must be set to the HDFS logical service name (it must match dfs.nameservices in hdfs-site.xml) -->
    <property>  
        <name>fs.defaultFS</name>  
        <value>hdfs://hjbdfs</value>  
    </property>  
    <property>  
        <name>hadoop.tmp.dir</name>  
        <value>/main/hadoop-2.8.4/data/tmp</value>  
    </property>  
    <property>  
        <name>hadoop.http.staticuser.user</name>  
        <value>hadoop</value>  
    </property>  
    <property>  
        <name>ha.zookeeper.quorum</name>  
        <value>192.168.101.173:2181,192.168.101.183:2181,192.168.101.193:2181</value>
    </property>
</configuration>

More parameters: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml

Edit hdfs-site.xml:

Note: if ssh on the fencing target does not use the default port 22, the fencing method must be written as sshfence([[username][:port]]), e.g. sshfence(hadoop:2121). Otherwise, when the active NameNode dies, sshfence cannot reach the other machine and automatic failover will not happen.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <!-- Logical name of the HDFS nameservice; use the hjbdfs set by fs.defaultFS in core-site.xml above -->
    <property>
        <name>dfs.nameservices</name>
        <value>hjbdfs</value>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>134217728</value>
    </property>
    <!-- NameNode IDs under the logical service name hjbdfs -->
    <property>
        <name>dfs.ha.namenodes.hjbdfs</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1, i.e. the host nn1 runs on -->
    <property>
        <name>dfs.namenode.rpc-address.hjbdfs.nn1</name>
        <value>node1:8020</value>
    </property>
    <!-- HTTP address of nn1, for external access -->
    <property>
        <name>dfs.namenode.http-address.hjbdfs.nn1</name>
        <value>node1:50070</value>
    </property>
    <!-- RPC address of nn2, i.e. the host nn2 runs on -->
    <property>
        <name>dfs.namenode.rpc-address.hjbdfs.nn2</name>
        <value>node2:8020</value>
    </property>
    <!-- HTTP address of nn2, for external access -->
    <property>
        <name>dfs.namenode.http-address.hjbdfs.nn2</name>
        <value>node2:50070</value>
    </property>
    <!-- Where the NameNode metadata (edit log) is stored on the JournalNodes (usually co-located with ZooKeeper) -->
    <!-- A set of JournalNode URIs: the active NN writes edit logs to them, and the standby NameNode reads those edit logs and applies them to its in-memory namespace. Multiple JournalNodes are separated by semicolons. The value must follow the format qjournal://host1:port1;host2:port2;host3:port3/journalId -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://node3:8485;node4:8485;node5:8485/hjbdf_journal</value>
    </property>
    <!-- Where the JournalNode stores its data on local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/main/hadoop-2.8.4/data/journal</value>
    </property>
    <!-- Class responsible for performing failover on the client side when the active NameNode fails -->
    <property>
        <name>dfs.client.failover.proxy.provider.hjbdfs</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing prevents split-brain in the HA cluster (two masters serving at once, leaving the system inconsistent). In HDFS HA the JournalNodes only allow one NameNode to write, so two active NameNodes cannot both commit edits. This property configures the fencing method used during automatic failover; several methods are available (see the official docs linked at the end). Here the remote-login-and-kill method is used. -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence(hadoop:2121)</value>
        <description>how to communicate in the switch process</description>
    </property>
    <!-- SSH private key for passwordless login, needed only when the sshfence method is used -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <!-- Timeout for the sshfence method; like the property above, it can probably be omitted if you use a script-based fencing method -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <!-- Enable automatic failover; can be left unset if you do not use automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/main/hadoop-2.8.4/data/hdfs/datanode</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/main/hadoop-2.8.4/data/hdfs/namenode</value>
    </property>
</configuration>

References: http://ju.outofmemory.cn/entry/95494

https://www.cnblogs.com/meiyuanbao/p/3545929.html

Official parameters: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
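
After editing, a quick sanity check that the HA properties are being picked up can be done with the stock hdfs getconf tool (a hedged sketch; run it on any configured node):

# Print resolved configuration values to confirm the nameservice and NameNode addresses
/main/hadoop-2.8.4/bin/hdfs getconf -confKey dfs.nameservices
/main/hadoop-2.8.4/bin/hdfs getconf -confKey dfs.ha.namenodes.hjbdfs
/main/hadoop-2.8.4/bin/hdfs getconf -confKey dfs.namenode.rpc-address.hjbdfs.nn1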

Rename mapred-site.xml.template to mapred-site.xml and edit it:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>    
        <name>mapreduce.framework.name</name>    
        <value>yarn</value>    
    </property>    
    <property>    
        <name>mapreduce.jobhistory.address</name>    
        <value>node1:10020</value>    
    </property>    
    <property>    
        <name>mapreduce.jobhistory.webapp.address</name>    
        <value>node1:19888</value>    
    </property>    
</configuration>

Configure yarn-site.xml:

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
    <property>  
        <name>yarn.nodemanager.aux-services</name>  
        <value>mapreduce_shuffle</value>  
    </property>
    <!-- Site specific YARN configuration properties -->
    <!-- Enable ResourceManager HA -->
    <!-- Whether RM HA is turned on -->
    <property>  
       <name>yarn.resourcemanager.ha.enabled</name>  
       <value>true</value>  
    </property>  
    <!-- Declare the two ResourceManagers -->
    <property>  
       <name>yarn.resourcemanager.cluster-id</name>  
       <value>rmcluster</value>  
    </property>  
    <property>  
       <name>yarn.resourcemanager.ha.rm-ids</name>  
       <value>rm1,rm2</value>  
    </property>  
    <property>  
       <name>yarn.resourcemanager.hostname.rm1</name>  
       <value>node1</value>  
    </property>  
    <property>  
       <name>yarn.resourcemanager.hostname.rm2</name>  
       <value>node2</value>  
    </property>  
   
    <!-- ZooKeeper cluster addresses -->
    <property>  
       <name>yarn.resourcemanager.zk-address</name>  
        <value>192.168.101.173:2181,192.168.101.183:2181,192.168.101.193:2181</value>
    </property>  
    <!-- Enable automatic recovery so that if the RM dies while jobs are running they can be recovered; the default is false -->
    <property>  
       <name>yarn.resourcemanager.recovery.enabled</name>  
       <value>true</value>  
    </property>  
   
    <!-- Store the ResourceManager state in the ZooKeeper cluster; by default it is stored in the FileSystem -->
    <property>  
       <name>yarn.resourcemanager.store.class</name>  
       <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>  
    </property> 

</configuration>

With the basic configuration done, package it up:

tar -zcvf  hadoop-2.8.4.ready.tar.gz  /main/hadoop-2.8.4

Then use scp to copy the archive to the other 4 nodes:

scp -P 2121 hadoop-2.8.4.ready.tar.gz hadoop@node2:/main
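
A small loop saves the repeated typing; a sketch assuming the same port and target path on every node (host names as in the hosts file above):

# Copy the packaged Hadoop to the other four nodes in one go
for SERVER in node2 node3 node4 node5
do
    scp -P 2121 hadoop-2.8.4.ready.tar.gz hadoop@$SERVER:/main
done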

On the other 4 nodes, extract it and make sure the directory ends up as /main/hadoop-2.8.4.

Then continue configuring the two masters, node1 and node2.

Add yarn.resourcemanager.ha.id to yarn-site.xml on node1 and node2 respectively:

It is similar to ZooKeeper's myid.

<!-- on node1 -->
<property>
   <name>yarn.resourcemanager.ha.id</name>
   <value>rm1</value>
</property>
<!-- on node2 -->
<property>
   <name>yarn.resourcemanager.ha.id</name>
   <value>rm2</value>
</property>

Start the JournalNodes:

Start the JournalNode on the three slave nodes:

[hadoop@node5 hadoop-2.8.4]$ /main/hadoop-2.8.4/sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-journalnode-node5.out
[hadoop@node5 hadoop-2.8.4]$ jps
2272 Jps
2219 JournalNode

Start the NameNodes:

Format the NameNode on one master (the JournalNodes must be running before formatting):

Do not format it again, and do not format it on the other node as well; otherwise the NameNode and DataNode namespaceIDs will not match and errors will occur!!!

/main/hadoop-2.8.4/bin/hdfs namenode -format

Start one NameNode:

[hadoop@node1 hadoop]$ /main/hadoop-2.8.4/sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-namenode-node1.out
[hadoop@node1 hadoop]$ jps
7536 Jps
7457 NameNode
[hadoop@node1 hadoop]$ ps -ef|grep namenode
hadoop    7457     1 10 09:20 pts/4    00:00:08 /main/jdk1.8.0_171/bin/java -Dproc_namenode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/main/hadoop-2.8.4/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/main/hadoop-2.8.4 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/main/hadoop-2.8.4/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/main/hadoop-2.8.4/logs -Dhadoop.log.file=hadoop-hadoop-namenode-node1.log -Dhadoop.home.dir=/main/hadoop-2.8.4 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,RFA -Djava.library.path=/main/hadoop-2.8.4/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS -Dhdfs.audit.logger=INFO,NullAppender -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode

At this point you should be able to open node1's HDFS web UI; its state is standby:

http://192.168.101.172:50070/

Then, on the other master, first synchronize the metadata from the first master's NameNode before starting it:

This mainly synchronizes the data/hdfs/namenode/ contents (including the namespaceID); otherwise the two nodes would be inconsistent and errors would occur.

/main/hadoop-2.8.4/bin/hdfs namenode -bootstrapStandby

The log shows the file being downloaded successfully:

18/06/28 11:01:56 WARN common.Util: Path /main/hadoop-2.8.4/data/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
18/06/28 11:01:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
=====================================================
About to bootstrap Standby ID nn2 from:
           Nameservice ID: hjbdfs
        Other Namenode ID: nn1
  Other NN's HTTP address: http://node1:50070
  Other NN's IPC  address: node1/192.168.101.172:8020
             Namespace ID: 675265321
            Block pool ID: BP-237410497-192.168.101.172-1530153904905
               Cluster ID: CID-604da42a-d0a8-403b-b073-68c857c9b772
           Layout version: -63
       isUpgradeFinalized: true
=====================================================
Re-format filesystem in Storage Directory /main/hadoop-2.8.4/data/hdfs/namenode ? (Y or N) Y
18/06/28 11:02:06 INFO common.Storage: Storage directory /main/hadoop-2.8.4/data/hdfs/namenode has been successfully formatted.
18/06/28 11:02:06 WARN common.Util: Path /main/hadoop-2.8.4/data/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
18/06/28 11:02:06 WARN common.Util: Path /main/hadoop-2.8.4/data/hdfs/namenode should be specified as a URI in configuration files. Please update hdfs configuration.
18/06/28 11:02:06 INFO namenode.FSEditLog: Edit logging is async:true
18/06/28 11:02:06 INFO namenode.TransferFsImage: Opening connection to http://node1:50070/imagetransfer?getimage=1&txid=0&storageInfo=-63:675265321:1530153904905:CID-604da42a-d0a8-403b-b073-68c857c9b772&bootstrapstandby=true
18/06/28 11:02:06 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
18/06/28 11:02:06 INFO namenode.TransferFsImage: Transfer took 0.01s at 0.00 KB/s
18/06/28 11:02:06 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000000 size 323 bytes.
18/06/28 11:02:06 INFO util.ExitUtil: Exiting with status 0
18/06/28 11:02:06 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node2/192.168.101.206
************************************************************/

Then start it:

/main/hadoop-2.8.4/sbin/hadoop-daemon.sh start namenode

Its web UI also shows standby.

Manually forcing one node to become the active master is problematic: it throws an EOFException and the NameNode crashes:

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hdfs haadmin -transitionToActive nn1 --forcemanual

Use ZooKeeper to take over NameNode failover automatically:

First shut down the whole cluster (leave ZooKeeper running), then run bin/hdfs zkfc -formatZK to format the ZKFC state in ZooKeeper.

Shut down:

/main/hadoop-2.8.4/sbin/stop-dfs.sh

Format ZKFC:

/main/hadoop-2.8.4/bin/hdfs zkfc -formatZK

After it finishes, ZooKeeper has a new hadoop-ha znode:
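
You can confirm the new znode from the ZooKeeper CLI (the zkCli.sh path below is an assumption based on where ZooKeeper was installed earlier):

# Connect to any ZooKeeper node; inside the CLI run: ls /hadoop-ha   (it should list [hjbdfs])
/main/zookeeper/bin/zkCli.sh -server 192.168.101.173:2181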

Start the whole cluster:

/main/hadoop-2.8.4/sbin/start-dfs.sh 

You can see that this command starts the namenodes, datanodes, and journalnodes in turn, then starts zkfc and registers with ZooKeeper:

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/sbin/start-dfs.sh 
18/06/28 11:57:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [node1 node2]
node1: starting namenode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-namenode-node1.out
node2: starting namenode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-namenode-node2.out
node3: starting datanode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-datanode-node3.out
node4: starting datanode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-datanode-node4.out
node5: starting datanode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-datanode-node5.out
Starting journal nodes [node3 node4 node5]
node3: starting journalnode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-journalnode-node3.out
node4: starting journalnode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-journalnode-node4.out
node5: starting journalnode, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-journalnode-node5.out
18/06/28 11:58:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [node1 node2]
node1: starting zkfc, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-zkfc-node1.out
node2: starting zkfc, logging to /main/hadoop-2.8.4/logs/hadoop-hadoop-zkfc-node2.out

You can see that the service has been registered in ZooKeeper.

Contents of the two znodes in ZooKeeper:

[zk: localhost:2181(CONNECTED) 10] get /hadoop-ha/hjbdfs/ActiveBreadCrumb

hjbdfsnn2node2 �>(�>
cZxid = 0x30000000a
ctime = Thu Jun 28 11:49:36 CST 2018
mZxid = 0x30000000a
mtime = Thu Jun 28 11:49:36 CST 2018
pZxid = 0x30000000a
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 26
numChildren = 0
[zk: localhost:2181(CONNECTED) 16] get /hadoop-ha/hjbdfs/ActiveStandbyElectorLock

hjbdfsnn1node1 �>(�>
cZxid = 0x3000001a7
ctime = Thu Jun 28 11:54:13 CST 2018
mZxid = 0x3000001a7
mtime = Thu Jun 28 11:54:13 CST 2018
pZxid = 0x3000001a7
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x163f1ef62ad008c
dataLength = 26
numChildren = 0

The NameNode processes at this point:

[hadoop@node1 hadoop-2.8.4]$ jps
14885 NameNode
15191 DFSZKFailoverController
15321 Jps
[hadoop@node2 hadoop-2.8.4]$ jps
18850 NameNode
19059 Jps
18952 DFSZKFailoverController

All 3 DataNodes look the same:

[hadoop@node3 hadoop-2.8.4]$ jps
5409 DataNode
5586 Jps
5507 JournalNode

At this point one node is active and the other is standby.

The installation looks successful so far; next comes testing.

After a later restart the active node was node1, so let's test ZooKeeper's automatic NameNode failover:

Kill the NN process on the active node node1:

[hadoop@node1 hadoop-2.8.4]$ jps
18850 NameNode
18952 DFSZKFailoverController
19103 Jps
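
A minimal way to simulate the failure and watch the switch, using the PID shown by jps above and the stock haadmin tool:

# Kill the active NameNode on node1 (PID taken from the jps output above)
kill -9 18850
# A few seconds later, query both NameNodes: nn1 refuses the connection (it is down)
# and nn2 should report active
/main/hadoop-2.8.4/bin/hdfs haadmin -getServiceState nn1
/main/hadoop-2.8.4/bin/hdfs haadmin -getServiceState nn2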

The failover can be seen in the hadoop-hadoop-zkfc-node2.log log:

2018-06-28 16:07:40,181 INFO org.apache.hadoop.ha.NodeFencer: ====== Beginning Service Fencing Process... ======
2018-06-28 16:07:40,181 INFO org.apache.hadoop.ha.NodeFencer: Trying method 1/1: org.apache.hadoop.ha.SshFenceByTcpPort(hadoop:2121)
2018-06-28 16:07:40,220 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connecting to node1...
2018-06-28 16:07:40,222 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connecting to node1 port 2121
2018-06-28 16:07:40,229 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Connection established
2018-06-28 16:07:40,236 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Remote version string: SSH-2.0-OpenSSH_5.3
2018-06-28 16:07:40,236 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Local version string: SSH-2.0-JSCH-0.1.54
2018-06-28 16:07:40,236 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: CheckCiphers: aes256-ctr,aes192-ctr,aes128-ctr,aes256-cbc,aes192-cbc,aes128-cbc,3des-ctr,arcfour,arcfour128,arcfour256
2018-06-28 16:07:40,635 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: CheckKexes: diffie-hellman-group14-sha1,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521
2018-06-28 16:07:40,725 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: CheckSignatures: ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521
2018-06-28 16:07:40,729 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXINIT sent
2018-06-28 16:07:40,729 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXINIT received
2018-06-28 16:07:40,730 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
2018-06-28 16:07:40,730 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: ssh-rsa,ssh-dss
2018-06-28 16:07:40,730 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
2018-06-28 16:07:40,730 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
2018-06-28 16:07:40,730 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
2018-06-28 16:07:40,730 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
2018-06-28 16:07:40,730 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: none,zlib@openssh.com
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: none,zlib@openssh.com
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: 
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server: 
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group14-sha1,diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: aes128-ctr,aes128-cbc,3des-ctr,3des-cbc,blowfish-cbc,aes192-ctr,aes192-cbc,aes256-ctr,aes256-cbc
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: aes128-ctr,aes128-cbc,3des-ctr,3des-cbc,blowfish-cbc,aes192-ctr,aes192-cbc,aes256-ctr,aes256-cbc
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: hmac-md5,hmac-sha1,hmac-sha2-256,hmac-sha1-96,hmac-md5-96
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: hmac-md5,hmac-sha1,hmac-sha2-256,hmac-sha1-96,hmac-md5-96
2018-06-28 16:07:40,731 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: none
2018-06-28 16:07:40,732 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: none
2018-06-28 16:07:40,732 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: 
2018-06-28 16:07:40,732 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client: 
2018-06-28 16:07:40,732 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: server->client aes128-ctr hmac-md5 none
2018-06-28 16:07:40,732 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: kex: client->server aes128-ctr hmac-md5 none
2018-06-28 16:07:40,769 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_KEXDH_INIT sent
2018-06-28 16:07:40,770 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: expecting SSH_MSG_KEXDH_REPLY
2018-06-28 16:07:40,804 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: ssh_rsa_verify: signature true
2018-06-28 16:07:40,811 WARN org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Permanently added 'node1' (RSA) to the list of known hosts.
2018-06-28 16:07:40,811 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_NEWKEYS sent
2018-06-28 16:07:40,811 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_NEWKEYS received
2018-06-28 16:07:40,817 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_SERVICE_REQUEST sent
2018-06-28 16:07:40,818 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: SSH_MSG_SERVICE_ACCEPT received
2018-06-28 16:07:40,820 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Authentications that can continue: gssapi-with-mic,publickey,keyboard-interactive,password
2018-06-28 16:07:40,820 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Next authentication method: gssapi-with-mic
2018-06-28 16:07:40,824 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Authentications that can continue: publickey,keyboard-interactive,password
2018-06-28 16:07:40,824 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Next authentication method: publickey
2018-06-28 16:07:40,902 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Authentication succeeded (publickey).
2018-06-28 16:07:40,902 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connected to node1
2018-06-28 16:07:40,902 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Looking for process running on port 8020
2018-06-28 16:07:40,950 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Indeterminate response from trying to kill service. Verifying whether it is running using nc...
2018-06-28 16:07:40,966 WARN org.apache.hadoop.ha.SshFenceByTcpPort: nc -z node1 8020 via ssh: bash: nc: command not found
2018-06-28 16:07:40,968 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Verified that the service is down.
2018-06-28 16:07:40,968 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Disconnecting from node1 port 2121
2018-06-28 16:07:40,972 INFO org.apache.hadoop.ha.NodeFencer: ====== Fencing successful by method org.apache.hadoop.ha.SshFenceByTcpPort(hadoop:2121) ======
2018-06-28 16:07:40,972 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing znode /hadoop-ha/hjbdfs/ActiveBreadCrumb to indicate that the local node is the most recent active...
2018-06-28 16:07:40,973 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Caught an exception, leaving main loop due to Socket closed
2018-06-28 16:07:40,979 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at node2/192.168.101.206:8020 active...
2018-06-28 16:07:41,805 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at node2/192.168.101.206:8020 to active state

Start YARN to test ResourceManager HA. On node1:

[hadoop@node1 sbin]$ /main/hadoop-2.8.4/sbin/start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /main/hadoop-2.8.4/logs/yarn-hadoop-resourcemanager-node1.out
node3: starting nodemanager, logging to /main/hadoop-2.8.4/logs/yarn-hadoop-nodemanager-node3.out
node4: starting nodemanager, logging to /main/hadoop-2.8.4/logs/yarn-hadoop-nodemanager-node4.out
node5: starting nodemanager, logging to /main/hadoop-2.8.4/logs/yarn-hadoop-nodemanager-node5.out

Start the ResourceManager on node2:

[hadoop@node2 sbin]$ /main/hadoop-2.8.4/sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /main/hadoop-2.8.4/logs/yarn-hadoop-resourcemanager-node2.out

Opening node2's ResourceManager address http://192.168.101.206:8088 in a browser automatically redirects to http://node1:8088, i.e. http://192.168.101.172:8088/cluster.
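
The ResourceManager HA state can also be checked from the command line with the standard yarn rmadmin tool, using the rm1/rm2 ids configured above:

# One should report active and the other standby
/main/hadoop-2.8.4/bin/yarn rmadmin -getServiceState rm1
/main/hadoop-2.8.4/bin/yarn rmadmin -getServiceState rm2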

Test HDFS file upload/download/delete:

Create a file word_i_have_a_dream.txt containing the text of Martin Luther King's English speech.

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -put word_i_have_a_dream.txt /word.txt
18/06/28 16:39:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -ls /word.txt
18/06/28 16:39:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   3 hadoop supergroup       4805 2018-06-28 16:39 /word.txt
[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -rm /word.txt
18/06/28 16:39:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Deleted /word.txt
[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -ls /word.txt
18/06/28 16:40:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: `/word.txt': No such file or directory

Download uses the get command; for more see the official docs or other blogs such as https://www.cnblogs.com/lzfhope/p/6952869.html

Run the classic wordcount test:

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop jar /main/hadoop-2.8.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.4.jar wordcount /word.txt /wordoutput
18/06/28 16:50:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/06/28 16:50:36 INFO input.FileInputFormat: Total input files to process : 1
18/06/28 16:50:36 INFO mapreduce.JobSubmitter: number of splits:1
18/06/28 16:50:36 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1530173899165_0001
18/06/28 16:50:37 INFO impl.YarnClientImpl: Submitted application application_1530173899165_0001
18/06/28 16:50:37 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1530173899165_0001/
18/06/28 16:50:37 INFO mapreduce.Job: Running job: job_1530173899165_0001
18/06/28 16:50:49 INFO mapreduce.Job: Job job_1530173899165_0001 running in uber mode : false
18/06/28 16:50:49 INFO mapreduce.Job:  map 0% reduce 0%
18/06/28 16:50:58 INFO mapreduce.Job:  map 100% reduce 0%
18/06/28 16:51:07 INFO mapreduce.Job:  map 100% reduce 100%
18/06/28 16:51:08 INFO mapreduce.Job: Job job_1530173899165_0001 completed successfully
18/06/28 16:51:08 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=4659
        FILE: Number of bytes written=330837
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=4892
        HDFS: Number of bytes written=3231
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=6187
        Total time spent by all reduces in occupied slots (ms)=6411
        Total time spent by all map tasks (ms)=6187
        Total time spent by all reduce tasks (ms)=6411
        Total vcore-milliseconds taken by all map tasks=6187
        Total vcore-milliseconds taken by all reduce tasks=6411
        Total megabyte-milliseconds taken by all map tasks=6335488
        Total megabyte-milliseconds taken by all reduce tasks=6564864
    Map-Reduce Framework
        Map input records=32
        Map output records=874
        Map output bytes=8256
        Map output materialized bytes=4659
        Input split bytes=87
        Combine input records=874
        Combine output records=359
        Reduce input groups=359
        Reduce shuffle bytes=4659
        Reduce input records=359
        Reduce output records=359
        Spilled Records=718
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=194
        CPU time spent (ms)=1860
        Physical memory (bytes) snapshot=444248064
        Virtual memory (bytes) snapshot=4178436096
        Total committed heap usage (bytes)=317718528
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=4805
    File Output Format Counters 
        Bytes Written=3231

You can see the output files generated under /wordoutput on HDFS:

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -ls /
18/06/28 16:52:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 3 items
drwx------   - hadoop supergroup          0 2018-06-28 16:50 /tmp
-rw-r--r--   3 hadoop supergroup       4805 2018-06-28 16:43 /word.txt
drwxr-xr-x   - hadoop supergroup          0 2018-06-28 16:51 /wordoutput

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -ls /wordoutput
18/06/28 16:52:49 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   3 hadoop supergroup          0 2018-06-28 16:51 /wordoutput/_SUCCESS
-rw-r--r--   3 hadoop supergroup       3231 2018-06-28 16:51 /wordoutput/part-r-00000
[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -ls /wordoutput/_SUCCESS
18/06/28 16:53:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   3 hadoop supergroup          0 2018-06-28 16:51 /wordoutput/_SUCCESS

You can see that _SUCCESS is 0 bytes, so the content is in part-r-00000; pull it to local and take a look:

[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -ls /wordoutput/part-r-00000
18/06/28 16:54:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
-rw-r--r--   3 hadoop supergroup       3231 2018-06-28 16:51 /wordoutput/part-r-00000
[hadoop@node1 hadoop-2.8.4]$ /main/hadoop-2.8.4/bin/hadoop fs -get /wordoutput/part-r-00000 word_success.txt
18/06/28 16:54:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@node1 hadoop-2.8.4]$ vim word_success.txt 

It produced the result we wanted.

Next, try killing the NN master to test HA.

Notes:

If you get connection-refused errors, check whether the other master NameNode's own service is listening properly, whether its firewall is off, and whether its hosts file is consistent with the NameNode settings in hdfs-site.xml.

If hostnames are used in the configuration files, every node must map the IPs of all cluster nodes in its hosts file, and its own hostname must map to its LAN IP, never to 127.0.0.1; otherwise its services bind to 127.0.0.1:xxxx and other nodes on the LAN cannot connect!!!

If you run into problems and need to re-format the NameNode, you must clean the old data on all nodes first; otherwise the old DataNodes' namespaceID will not match and they will fail to start properly:

rm -rf /main/hadoop-2.8.4/data/journal/*
rm -rf /main/hadoop-2.8.4/data/hdfs/namenode/*
rm -rf /main/hadoop-2.8.4/data/hdfs/datanode/*
rm -rf /main/hadoop-2.8.4/logs/*

Again, if ssh on the fencing target does not use the default port 22, the fencing method must be written as sshfence([[username][:port]]), e.g. sshfence(hadoop:2121); otherwise, when the active NameNode dies, sshfence cannot reach the other machine and automatic failover will not happen.

If you run into problems during installation, the *.log files under /main/hadoop-2.8.4/logs contain detailed logs.

HBase installation

HMaster has no single point of failure: HBase can start multiple HMasters, and ZooKeeper's master election mechanism guarantees that exactly one master is always active. So for HBase high availability we only need to start two HMasters and let ZooKeeper pick the active one.

Extract to /main on all 5 machines:

tar -zvxf /main/soft/hbase-1.2.6.1-bin.tar.gz -C /main/

On top of the Hadoop setup, configure the HBASE_HOME environment variable and hbase-env.sh.

vim /etc/profile and set the following:

#java, set earlier for Hadoop
export JAVA_HOME=/main/jdk1.8.0_171
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib  
export PATH=:$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
#hbase
export HBASE_HOME=/main/hbase-1.2.6.1
export  PATH=$HBASE_HOME/bin:$PATH

Set the following in HBase's hbase-env.sh:

Set JAVA_HOME, disable HBase's bundled ZooKeeper so that our own ZooKeeper is used, and set the ssh port:

export JAVA_HOME=/main/jdk1.8.0_171

export HBASE_MANAGES_ZK=false

export HBASE_SSH_OPTS="-p 2121"

Configure hbase-site.xml.

The official site documents the configuration files:

http://abloz.com/hbase/book.html

https://hbase.apache.org/2.0/book.html#example_config

https://hbase.apache.org/2.0/book.html#config.files

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!--命名空间-->
    <property>
        <name>hbase.rootdir</name>
<!--这里要和hadoop的HDFS的servicename名字一致,否则会报错!--> <value>hdfs://hjbdfs/hbase</value> <description>The directory shared by RegionServers.</description> </property> <property> <name>hbase.cluster.distributed</name> <value>true</value> </property> <property> <name>hbase.master.port</name> <value>16000</value> <description>The port the HBase Master should bind to.</description> </property> <property> <name>hbase.zookeeper.quorum</name> <value>192.168.101.173,192.168.101.183,192.168.101.193</value> <description>逗号分割的zk服务器地址       Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on. </description> </property> <property> <name>hbase.zookeeper.property.clientPort</name> <value>2181</value> <description>Property from ZooKeeper's config zoo.cfg.The port at which the clients will connect.</description> </property> <property> <name>hbase.zookeeper.property.dataDir</name> <value>/main/zookeeper/data</value> <description>zk配置文件zoo.cfg中的data目录地址 Property from ZooKeeper config zoo.cfg.The directory where the snapshot is stored.</description> </property> <property> <name>hbase.tmp.dir</name> <value>/main/hbase-1.2.6.1/hbase/tmp</value> </property> </configuration>

Copy the Hadoop configuration files into the HBase conf directory to link the two:

[hadoop@node1 hbase-1.2.6.1]$ cp /main/hadoop-2.8.4/etc/hadoop/core-site.xml /main/hbase-1.2.6.1/conf/
[hadoop@node1 hbase-1.2.6.1]$ cp /main/hadoop-2.8.4/etc/hadoop/hdfs-site.xml /main/hbase-1.2.6.1/conf/

vim regionservers

node3
node4
node5

Official description:

regionservers:  A plain-text file containing a list of hosts which should run a RegionServer in your HBase cluster. By default this file contains the single entry localhost. It should contain a list of hostnames or IP addresses, one per line, and should only contain localhost if each node in your cluster will run a RegionServer on its localhost interface.

vim  backup-masters

On node1, put node2 in this file; then when the cluster is started from node1, node2 is brought up as the backup master.

Similarly, put node1 in the file on node2, so the cluster can be operated from either node.

node2

Official description:

backup-masters:  Not present by default. A plain-text file which lists hosts on which the Master should start a backup Master process, one host per line.

Start HMaster (either master node will do):

[hadoop@node2 bin]$ /main/hbase-1.2.6.1/bin/start-hbase.sh
starting master, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-master-node2.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node5: starting regionserver, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-regionserver-node5.out
node3: starting regionserver, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-regionserver-node3.out
node4: starting regionserver, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-regionserver-node4.out
node3: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node3: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node4: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node4: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node5: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node5: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node1: starting master, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-master-node1.out
node1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node1: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

Running it on the first HMaster node:

[hadoop@node1 hbase-1.2.6.1]$ /main/hbase-1.2.6.1/bin/start-hbase.sh 
starting master, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-master-node1.out
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node3: starting regionserver, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-regionserver-node3.out
node4: starting regionserver, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-regionserver-node4.out
node5: starting regionserver, logging to /main/hbase-1.2.6.1/bin/../logs/hbase-hadoop-regionserver-node5.out
node3: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node3: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node4: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node4: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
node5: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
node5: Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

Based on the config files it automatically SSHes to the RegionServers and starts them; if the local node itself is not a RegionServer, nothing extra appears on it after startup.

Now go to the second HMaster node and manually start another HMaster.
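
The command is the same hbase-daemon.sh start master mentioned later for restarting a killed master:

# On the second master node (node2 here), start an additional HMaster by hand
/main/hbase-1.2.6.1/bin/hbase-daemon.sh start master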

Process changes on a RegionServer before and after startup:

[hadoop@node3 hbase-1.2.6.1]$ jps
7460 DataNode
8247 Jps
7562 JournalNode
7660 NodeManager
[hadoop@node3 hbase-1.2.6.1]$ vim conf/hbase-env.sh 
[hadoop@node3 hbase-1.2.6.1]$ jps
7460 DataNode
8408 Jps
7562 JournalNode
8300 HRegionServer
7660 NodeManager

The standby master is node1:

The active master is node2:

Kill the process on node2 to test master failover:

[hadoop@node2 bin]$ jps
3809 Jps
1412 NameNode
3607 HMaster
1529 DFSZKFailoverController
[hadoop@node2 bin]$ kill 3607
[hadoop@node2 bin]$ jps
3891 Jps
1412 NameNode
1529 DFSZKFailoverController

node1 successfully became the active master:

You can use ./hbase-daemon.sh start master to bring the killed master back up.

After the installation above, node1 and node2 show:

[hadoop@node1 hbase-1.2.6.1]$ jps
31458 NameNode
31779 DFSZKFailoverController
5768 Jps
5482 HMaster
31871 ResourceManager

node3-5:

[hadoop@node3 hbase-1.2.6.1]$ jps
9824 Jps
9616 HRegionServer
7460 DataNode
7562 JournalNode
7660 NodeManager

Spark installation:

On another cluster, Hadoop was installed in the same way as above:

192.168.210.114 node1
192.168.210.115 node2
192.168.210.116 node3
192.168.210.117 node4
192.168.210.134 node5
192.168.210.135 node6
192.168.210.136 node7
192.168.210.137 node8

Spark is installed on node1-node5. Spark depends on Hadoop:

so when downloading from http://spark.apache.org/downloads.html, pick a build compatible with your Hadoop version.

You also need the matching Scala version: check http://spark.apache.org/docs/latest/ for the corresponding Scala version.

So we downloaded Scala 2.11.11, extracted it to /main, and set PATH.
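
The corresponding /etc/profile lines would look roughly like this (the /main/scala-2.11.11 path matches the SCALA_HOME used in spark-env.sh below):

# Scala for Spark
export SCALA_HOME=/main/scala-2.11.11
export PATH=$SCALA_HOME/bin:$PATH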

After downloading spark-2.3.1-bin-hadoop2.7.tgz and extracting it to /main:

Edit spark-env.sh and set it according to your environment:

#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# This file is sourced when running various Spark programs.
# Copy it as spark-env.sh and edit that to configure Spark for your site.

# Options read when launching programs locally with
# ./bin/run-example or ./bin/spark-submit
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public dns name of the driver program

# Options read by executors and drivers running inside the cluster
# - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
# - SPARK_PUBLIC_DNS, to set the public DNS name of the driver program
# - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data
# - MESOS_NATIVE_JAVA_LIBRARY, to point to your libmesos.so if you use Mesos

# Options read in YARN client/cluster mode
# - SPARK_CONF_DIR, Alternate conf dir. (Default: ${SPARK_HOME}/conf)
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
# - YARN_CONF_DIR, to point Spark towards YARN configuration files when you use YARN
# - SPARK_EXECUTOR_CORES, Number of cores for the executors (Default: 1).
# - SPARK_EXECUTOR_MEMORY, Memory per Executor (e.g. 1000M, 2G) (Default: 1G)
# - SPARK_DRIVER_MEMORY, Memory for Driver (e.g. 1000M, 2G) (Default: 1G)

# Options for the daemons used in the standalone deploy mode
# - SPARK_MASTER_HOST, to bind the master to a different IP address or hostname
# - SPARK_MASTER_PORT / SPARK_MASTER_WEBUI_PORT, to use non-default ports for the master
# - SPARK_MASTER_OPTS, to set config properties only for the master (e.g. "-Dx=y")
# - SPARK_WORKER_CORES, to set the number of cores to use on this machine
# - SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)
# - SPARK_WORKER_PORT / SPARK_WORKER_WEBUI_PORT, to use non-default ports for the worker
# - SPARK_WORKER_DIR, to set the working directory of worker processes
# - SPARK_WORKER_OPTS, to set config properties only for the worker (e.g. "-Dx=y")
# - SPARK_DAEMON_MEMORY, to allocate to the master, worker and history server themselves (default: 1g).
# - SPARK_HISTORY_OPTS, to set config properties only for the history server (e.g. "-Dx=y")
# - SPARK_SHUFFLE_OPTS, to set config properties only for the external shuffle service (e.g. "-Dx=y")
# - SPARK_DAEMON_JAVA_OPTS, to set config properties for all daemons (e.g. "-Dx=y")
# - SPARK_DAEMON_CLASSPATH, to set the classpath for all daemons
# - SPARK_PUBLIC_DNS, to set the public dns name of the master or workers

# Generic options for the daemons used in the standalone deploy mode
# - SPARK_CONF_DIR      Alternate conf dir. (Default: ${SPARK_HOME}/conf)
# - SPARK_LOG_DIR       Where log files are stored.  (Default: ${SPARK_HOME}/logs)
# - SPARK_PID_DIR       Where the pid file is stored. (Default: /tmp)
# - SPARK_IDENT_STRING  A string representing this instance of spark. (Default: $USER)
# - SPARK_NICENESS      The scheduling priority for daemons. (Default: 0)
# - SPARK_NO_DAEMONIZE  Run the proposed command in the foreground. It will not output a PID file.
# Options for native BLAS, like Intel MKL, OpenBLAS, and so on.
# You might get better performance to enable these options if using native BLAS (see SPARK-21305).
# - MKL_NUM_THREADS=1        Disable multi-threading of Intel MKL
# - OPENBLAS_NUM_THREADS=1   Disable multi-threading of OpenBLAS
export JAVA_HOME=/main/server/jdk1.8.0_11
export SCALA_HOME=/main/scala-2.11.11
export HADOOP_HOME=/main/hadoop-2.8.4
export HADOOP_CONF_DIR=/main/hadoop-2.8.4/etc/hadoop
export SPARK_WORKER_MEMORY=4g
export SPARK_EXECUTOR_MEMORY=4g
export SPARK_DRIVER_MEMORY=4G
export SPARK_WORKER_CORES=2
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=192.168.210.38:2181,192.168.210.58:2181,192.168.210.78:2181 -Dspark.deploy.zookeeper.dir=/spark"
export SPARK_SSH_OPTS="-p 2121"

vim  slaves

node1
node2
node3
node4
node5

Now start everything: bring up all the slaves first, then the two masters.

start-all automatically starts the local node as a master and then starts every worker listed in the slaves file.

[hadoop@node1 conf]$ /main/spark-2.3.1-bin-hadoop2.7/sbin/start-all.sh 
starting org.apache.spark.deploy.master.Master, logging to /main/spark-2.3.1-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-node1.out
node2: starting org.apache.spark.deploy.worker.Worker, logging to /main/spark-2.3.1-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node2.out
node1: starting org.apache.spark.deploy.worker.Worker, logging to /main/spark-2.3.1-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node1.out
node3: starting org.apache.spark.deploy.worker.Worker, logging to /main/spark-2.3.1-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node3.out
node4: starting org.apache.spark.deploy.worker.Worker, logging to /main/spark-2.3.1-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node4.out
node5: starting org.apache.spark.deploy.worker.Worker, logging to /main/spark-2.3.1-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-node5.out
[hadoop@node1 conf]$ jps
11299 Master
11411 Worker
5864 NameNode
6184 DFSZKFailoverController
11802 Jps
6301 ResourceManager
6926 HMaster

The other machines all have a Worker process. For HA, start another Spark Master on node2:

[hadoop@node2 conf]$ jps
5536 Jps
2209 DFSZKFailoverController
2104 NameNode
2602 HMaster
5486 Worker
[hadoop@node2 conf]$ /main/spark-2.3.1-bin-hadoop2.7/sbin/start-master.sh 
starting org.apache.spark.deploy.master.Master, logging to /main/spark-2.3.1-bin-hadoop2.7/logs/spark-hadoop-org.apache.spark.deploy.master.Master-1-node2.out
[hadoop@node2 conf]$ jps
5568 Master
2209 DFSZKFailoverController
2104 NameNode
2602 HMaster
5486 Worker
5631 Jps

Check the two masters:

The other one is empty (standby):
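
As a quick smoke test of the standalone HA pair, a job can be pointed at both masters at once (a hedged sketch; the SparkPi example jar ships with the distribution, and the default standalone port 7077 is assumed):

# Submit the bundled SparkPi example against the HA master pair (jar path assumed from the standard 2.3.1 layout)
/main/spark-2.3.1-bin-hadoop2.7/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://node1:7077,node2:7077 \
  /main/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar 100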

Storm installation:

First, a borrowed architecture diagram:

Download apache-storm-1.2.2.tar.gz, extract it to /main on node6-node8, and set:

export STORM_HOME=/main/apache-storm-1.2.2

vim storm.yaml as follows, identical on all 3 machines:

 storm.zookeeper.servers:
     - "192.168.210.38"
     - "192.168.210.58"
     - "192.168.210.78"
 storm.zookeeper.port: 2181
 storm.local.dir: "/main/apache-storm-1.2.2/data"
 nimbus.seeds: ["192.168.210.135"]
 supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
     - 6703

 storm.health.check.dir: "healthchecks"
 storm.health.check.timeout.ms: 5000

The directory specified by storm.local.dir must be created in advance. The number of ports under supervisor.slots.ports determines the number of workers on each supervisor machine; each worker listens on its own port for tasks.

Create the directory:

mkdir  /main/apache-storm-1.2.2/data

The nimbus (master control) node can be more than one machine; see the nimbus.seeds: ["192.168.210.135"] setting.

Start nimbus first:

# Start nimbus
nohup /main/apache-storm-1.2.2/bin/storm nimbus &
# Start the nimbus UI
nohup /main/apache-storm-1.2.2/bin/storm ui &
# Start the supervisor
nohup /main/apache-storm-1.2.2/bin/storm supervisor &

Finally, start the supervisor on the other two machines:

nohup /main/apache-storm-1.2.2/bin/storm supervisor &

The nimbus node's UI page shows information about all supervisors.

Then download the Storm source from GitHub (or download the src zip from the Apache Storm homepage and extract it) and switch to the matching branch, e.g. 1.1.x, 1.x, or 2.x. At the moment 2.x is still a snapshot; since we installed 1.2.2, switch to the 1.x branch.

Especially for source downloaded from GitHub, be sure to run this once:

mvn clean install -DskipTests

Because the code on GitHub may be newer than the published releases, the related jars may not be in the Maven repository yet, so the command above compiles the Storm dependency artifacts into the local Maven repository.

PS: do not use a third-party Maven mirror for this step, or it will very likely fail!

After switching to 1.x, the latest storm-starter is 1.2.3-SNAPSHOT; let's test whether it is compatible with the 1.2.2 running on the servers.

Upload storm-starter-1.2.3-SNAPSHOT.jar to the server and try running the demo:

[hadoop@node6 main]$ /main/apache-storm-1.2.2/bin/storm jar storm-starter-1.2.3-SNAPSHOT.jar org.apache.storm.starter.WordCountTopology word-count
Running: /main/server/jdk1.8.0_11/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/main/apache-storm-1.2.2 -Dstorm.log.dir=/main/apache-storm-1.2.2/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /main/apache-storm-1.2.2/*:/main/apache-storm-1.2.2/lib/*:/main/apache-storm-1.2.2/extlib/*:storm-starter-1.2.3-SNAPSHOT.jar:/main/apache-storm-1.2.2/conf:/main/apache-storm-1.2.2/bin -Dstorm.jar=storm-starter-1.2.3-SNAPSHOT.jar -Dstorm.dependency.jars= -Dstorm.dependency.artifacts={} org.apache.storm.starter.WordCountTopology word-count
955  [main] WARN  o.a.s.u.Utils - STORM-VERSION new 1.2.2 old null
991  [main] INFO  o.a.s.StormSubmitter - Generated ZooKeeper secret payload for MD5-digest: -7683379793985025786:-5178094576454792625
1122 [main] INFO  o.a.s.u.NimbusClient - Found leader nimbus : node6:6627
1182 [main] INFO  o.a.s.s.a.AuthUtils - Got AutoCreds []
1190 [main] INFO  o.a.s.u.NimbusClient - Found leader nimbus : node6:6627
1250 [main] INFO  o.a.s.StormSubmitter - Uploading dependencies - jars...
1251 [main] INFO  o.a.s.StormSubmitter - Uploading dependencies - artifacts...
1251 [main] INFO  o.a.s.StormSubmitter - Dependency Blob keys - jars : [] / artifacts : []
1289 [main] INFO  o.a.s.StormSubmitter - Uploading topology jar storm-starter-1.2.3-SNAPSHOT.jar to assigned location: /main/apache-storm-1.2.2/data/nimbus/inbox/stormjar-5ed677a5-9af1-4b5e-8467-83f637e00506.jar
Start uploading file 'storm-starter-1.2.3-SNAPSHOT.jar' to '/main/apache-storm-1.2.2/data/nimbus/inbox/stormjar-5ed677a5-9af1-4b5e-8467-83f637e00506.jar' (106526828 bytes)
[==================================================] 106526828 / 106526828
File 'storm-starter-1.2.3-SNAPSHOT.jar' uploaded to '/main/apache-storm-1.2.2/data/nimbus/inbox/stormjar-5ed677a5-9af1-4b5e-8467-83f637e00506.jar' (106526828 bytes)
2876 [main] INFO  o.a.s.StormSubmitter - Successfully uploaded topology jar to assigned location: /main/apache-storm-1.2.2/data/nimbus/inbox/stormjar-5ed677a5-9af1-4b5e-8467-83f637e00506.jar
2876 [main] INFO  o.a.s.StormSubmitter - Submitting topology word-count in distributed mode with conf {"storm.zookeeper.topology.auth.scheme":"digest","storm.zookeeper.topology.auth.payload":"-7683379793985025786:-5178094576454792625","topology.workers":3,"topology.debug":true}
2876 [main] WARN  o.a.s.u.Utils - STORM-VERSION new 1.2.2 old 1.2.2
4091 [main] INFO  o.a.s.StormSubmitter - Finished submitting topology: word-count

It shows the topology was submitted; now check the UI.
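
Besides the UI, the running topologies can also be listed from the command line:

# List active topologies and their status
/main/apache-storm-1.2.2/bin/storm list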

You can also stop this topology:

[hadoop@node6 main]$ /main/apache-storm-1.2.2/bin/storm kill word-count
Running: /main/server/jdk1.8.0_11/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/main/apache-storm-1.2.2 -Dstorm.log.dir=/main/apache-storm-1.2.2/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /main/apache-storm-1.2.2/*:/main/apache-storm-1.2.2/lib/*:/main/apache-storm-1.2.2/extlib/*:/main/apache-storm-1.2.2/extlib-daemon/*:/main/apache-storm-1.2.2/conf:/main/apache-storm-1.2.2/bin org.apache.storm.command.kill_topology word-count
Original article: https://www.cnblogs.com/radio/p/9233502.html