Setting up a distributed Hadoop cluster in Docker

1. Pull an Ubuntu image and configure the Java environment
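
A minimal sketch of this step; the image tag, container names and JDK package below are assumptions, adjust them to your setup:

docker pull ubuntu:16.04
docker run -itd --name hadoop1 -h hadoop1 ubuntu:16.04 /bin/bash
docker run -itd --name hadoop2 -h hadoop2 ubuntu:16.04 /bin/bash
docker run -itd --name hadoop3 -h hadoop3 ubuntu:16.04 /bin/bash

# then, inside each container (repeat for hadoop2 and hadoop3):
docker exec -it hadoop1 /bin/bash
apt-get update && apt-get install -y openjdk-8-jdk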

2. Download the Hadoop package and add the cluster nodes to /etc/hosts (a download sketch follows the host entries):

172.17.0.5 hadoop1
172.17.0.6 hadoop2
172.17.0.2 hadoop3
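
A rough sketch of fetching and unpacking the Hadoop package; the version, mirror and install path are assumptions:

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar -xzf hadoop-2.7.3.tar.gz -C /usr/local
export HADOOP_HOME=/usr/local/hadoop-2.7.3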

3. Configure JAVA_HOME in hadoop-env.sh, mapred-env.sh and yarn-env.sh
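
For example, in each of the three files under $HADOOP_HOME/etc/hadoop (the JDK path below assumes OpenJDK 8 on Ubuntu and may differ in your image):

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64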

4. Configure core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/root/data/tmp</value>
    </property>
</configuration>
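
It is worth making sure the hadoop.tmp.dir path is writable on every node; creating it up front avoids permission surprises:

mkdir -p /home/root/data/tmp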

5. Configure hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop3:50090</value>
    </property>
</configuration>

6. Configure the slaves file (its location is noted after the list)

hadoop1
hadoop2
hadoop3
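
In Hadoop 2.x this list goes into the slaves file under the config directory; the path below assumes the standard layout:

cat > $HADOOP_HOME/etc/hadoop/slaves <<EOF
hadoop1
hadoop2
hadoop3
EOF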

7. Configure yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop2</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>106800</value>
    </property>
</configuration>
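
With log aggregation enabled as above, the logs of a finished application can later be pulled via the standard YARN CLI, e.g.:

yarn logs -applicationId <application_id>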

8. Configure mapred-site.xml

cp mapred-site.xml.template mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop1:19888</value>
    </property>
</configuration>

9. Set up SSH login

Install sshd:

apt-get install openssh-server
service ssh start
ps -e | grep ssh

Generate an SSH key pair:

ssh-keygen -t rsa

Set the root password:

passwd

Enable root remote login by setting PermitRootLogin yes in /etc/ssh/sshd_config, then restart sshd:

vim /etc/ssh/sshd_config
/etc/init.d/ssh restart

Distribute the public key to every node:

ssh-copy-id hadoop1
ssh-copy-id hadoop2
ssh-copy-id hadoop3
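
A quick check that passwordless login now works from hadoop1 to every node (each command should print the hostname without prompting for a password):

ssh hadoop1 hostname
ssh hadoop2 hostname
ssh hadoop3 hostname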

Format the NameNode (hadoop1):

hdfs namenode -format

Start the HDFS cluster on hadoop1:

sbin/start-dfs.sh

Startup error:

The authenticity of host '172.17.0.2 (172.17.0.2)' can't be established.
Host key verification failed.
vi /etc/ssh/ssh_config

Set StrictHostKeyChecking to no.
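
One way to do this, assuming StrictHostKeyChecking is not already set earlier in /etc/ssh/ssh_config, is to append a stanza:

cat >> /etc/ssh/ssh_config <<EOF
Host *
    StrictHostKeyChecking no
EOF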

Start YARN on hadoop1:

sbin/start-yarn.sh

Because yarn.resourcemanager.hostname is set to hadoop2, start the ResourceManager there separately:

sbin/yarn-daemon.sh start resourcemanager
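
The JobHistory addresses configured in mapred-site.xml are only served once the history server daemon is running; it is not started by start-dfs.sh or start-yarn.sh. A sketch, run on hadoop1, followed by a quick check of the running daemons on each node:

sbin/mr-jobhistory-daemon.sh start historyserver
jps
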
Original post: https://www.cnblogs.com/csig/p/9975195.html