Setting up 64-bit Hadoop 2.4.0 on Ubuntu 14.04 with JDK 1.7

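Note: Hadoop has two operating modes, secure and non-secure. Secure mode runs on top of strong, authentication-based access control. This article does not need secure mode and runs in non-secure mode, so the root user can be used directly.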

Everything in this article is run as the root user.

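I. Network configuration

Open a terminal and run ifconfig to find this machine's IP address, 192.168.209.139 in this example.

/etc/hosts stores the mapping between hostnames and IP addresses. Add the following entries on every node: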

gedit /etc/hosts

192.168.209.139 master
192.168.209.140 slave1
192.168.209.141 slave2
192.168.209.142 slave3
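
Since the four addresses above are consecutive, the entries can also be generated rather than typed on each node. This is a minimal sketch; `hosts_entries` is a hypothetical helper name, and it assumes the master takes the first address and the slaves follow it directly.

```shell
# hosts_entries PREFIX FIRST_OCTET SLAVE_COUNT  (hypothetical helper)
# Print /etc/hosts lines for one master and N slaves on consecutive addresses.
hosts_entries() {
    prefix=$1; octet=$2; slaves=$3
    echo "$prefix.$octet master"
    i=1
    while [ "$i" -le "$slaves" ]; do
        echo "$prefix.$((octet + i)) slave$i"
        i=$((i + 1))
    done
}

# Review the output first, then append it to /etc/hosts on every node, e.g.:
#   hosts_entries 192.168.209 139 3 >> /etc/hosts
hosts_entries 192.168.209 139 3
```

Printing to the terminal by default keeps the helper harmless; the redirect into /etc/hosts is a separate, deliberate step.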

Change the hostname on each machine:
gedit /etc/hostname and set it to master, slave1, slave2 and slave3 respectively

II. Passwordless SSH between the machines

Install ssh on every node

sudo apt-get install openssh-server

Start the service: sudo /etc/init.d/ssh start
Check the service: ps -e|grep ssh

On the master node:

cd /root/.ssh/

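ssh-keygen -t rsa -P ""    # -P supplies the passphrase; "" means an empty passphrase
cat id_rsa.pub >> ~/.ssh/authorized_keys    # append the public key to authorized_keys; the file is created if it does not exist
scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys    # copy to the slave nodes
scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys slave3:~/.ssh/authorized_keys

ssh master    # test that login works without a password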
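
As an alternative to copying authorized_keys with scp, the key can be pushed with ssh-copy-id, which appends to the remote file instead of replacing it. The sketch below is not the method used in this article; it assumes root login over SSH is allowed on the slaves, and `run` and `push_key` are hypothetical helper names. It only prints the commands unless RUN=1 is set.

```shell
# Slave hostnames as defined in /etc/hosts.
SLAVES="slave1 slave2 slave3"

# Print each command by default; set RUN=1 to really execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

push_key() {
    for host in $SLAVES; do
        # ssh-copy-id appends the public key to the remote authorized_keys.
        run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "root@$host"
        # BatchMode makes ssh fail instead of prompting, so a node that
        # still asks for a password shows up as an error.
        run ssh -o BatchMode=yes "root@$host" hostname
    done
}

push_key
```

Once the printed commands look right, run the script again with RUN=1; each slave should answer with its own hostname.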

III. Install the JDK

Download JDK 1.7 from the official website.

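tar -zxvf jdk-7u7-linux-i586.tar.gz    # i586 is the 32-bit build; on a 64-bit system use the x64 package (jdk-7u7-linux-x64.tar.gz) instead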

mv jdk1.7.0_07 /usr/local/jdk7

Set the environment variables: gedit /etc/profile

# set java environment
export JAVA_HOME=/usr/local/jdk7 # path where the JDK was extracted
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

# set hadoop
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"

#export PATH="$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH"
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$PATH

Apply the environment variables: source /etc/profile
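
After sourcing the profile it is worth confirming that the variables point at real directories before going further. The following is a small sketch; `check_env` is a hypothetical helper, and it only checks the two variables this article relies on.

```shell
# check_env: report whether JAVA_HOME and HADOOP_HOME are set and point at
# existing directories. Returns non-zero if any of them is wrong.
check_env() {
    status=0
    for name in JAVA_HOME HADOOP_HOME; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"; status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"; status=1
        else
            echo "$name=$value"
        fi
    done
    return $status
}

check_env || echo "fix /etc/profile, then run: source /etc/profile"
```

HADOOP_HOME will be reported as missing until Hadoop has been unpacked in the next section. Once both checks pass, java -version and hadoop version should each print a version banner.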

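IV. Install and configure Hadoop

The package on the official site is built for 32-bit, so for a 64-bit system you have to compile it yourself. It is easier to find a precompiled 64-bit package online.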

tar -zxvf hadoop-2.4.0-64bit.tar.gz
mv hadoop-2.4.0 /usr/local/hadoop

The configuration files are in /usr/local/hadoop/etc/hadoop/, so cd into that directory first.
1. Add the JDK path to both yarn-env.sh and hadoop-env.sh

gedit yarn-env.sh

gedit hadoop-env.sh

In each file, set:
export JAVA_HOME=/usr/local/jdk7
2. gedit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
<final>true</final>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
3. gedit hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
4. cp mapred-site.xml.template mapred-site.xml, then gedit mapred-site.xml (Hadoop reads mapred-site.xml, not the .template file)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>Execution framework set to Hadoop YARN.</description>
</property>
</configuration>
5. gedit yarn-site.xml
<configuration>

<!-- Site specific YARN configuration properties -->

<property>
<name>yarn.resourcemanager.address</name>
<value>master:9001</value>
<description>The address of the applications manager interface in the RM.</description>
</property>

<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
<description>The hostname of the RM.</description>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:18030</value>
<description>The address of the scheduler interface, used by ApplicationMasters to request resources from the RM scheduler.</description>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:18025</value>
<description>The address of the resource tracker interface for the nodeManagers</description>
</property>

<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:18035</value>
<description>The address for admin manager</description>
</property>

<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:18088</value>
<description>The address of the RM web application.</description>
</property>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
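
With four XML files edited by hand, a quick read-back of the key settings catches typos before the cluster is started. The sketch below uses a hypothetical `get_prop` helper and assumes the one-tag-per-line layout shown above; it is not a general XML parser.

```shell
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
get_prop() {
    awk -v key="$2" '
        # Remember when the wanted <name> line has been seen.
        $0 ~ ("<name>" key "</name>") { found = 1; next }
        # The next <value> line after it holds the setting.
        found && /<value>/ {
            gsub(/.*<value>|<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Read back one key setting per file; files that do not exist are skipped.
conf=${HADOOP_CONF_DIR:-/usr/local/hadoop/etc/hadoop}
for pair in core-site.xml:fs.defaultFS hdfs-site.xml:dfs.replication \
            mapred-site.xml:mapreduce.framework.name \
            yarn-site.xml:yarn.resourcemanager.hostname; do
    file=${pair%%:*}; key=${pair#*:}
    if [ -f "$conf/$file" ]; then
        echo "$key = $(get_prop "$conf/$file" "$key")"
    fi
done
```

Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself resolves, which is the more authoritative check.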
6. gedit slaves
Add the following to the file:
slave1
slave2
slave3

7. Copy the hadoop folder and the JDK to the other three machines, and copy the environment variable settings as well
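
The copy step can be sketched as a loop over the slave hostnames. This assumes the same /usr/local paths on every node and passwordless SSH as root already working; `run` and `distribute` are hypothetical helper names, and the script only prints the commands unless RUN=1 is set.

```shell
# Slave hostnames as listed in the slaves file.
SLAVES="slave1 slave2 slave3"

# Print each command by default; set RUN=1 to really execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

distribute() {
    for host in $SLAVES; do
        # Same paths on every node, so the /etc/profile entries stay valid.
        run scp -r /usr/local/hadoop "root@$host:/usr/local/"
        run scp -r /usr/local/jdk7 "root@$host:/usr/local/"
        # Assumption: every node has an identical /etc/profile, so the whole
        # file can be overwritten. Otherwise copy only the export lines.
        run scp /etc/profile "root@$host:/etc/profile"
    done
}

distribute
```

After the real run, log in to each slave and run source /etc/profile so the new variables take effect there too.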

8. Start Hadoop on the master node

[Format the NameNode, first start only]
cd $HADOOP_HOME/bin
./hdfs namenode -format
Start

cd $HADOOP_HOME
sbin/start-all.sh

Stop

sbin/stop-all.sh

jps                      # list the running Java processes
hdfs dfsadmin -report    # show cluster information
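
Reading jps output by eye on four machines is error-prone, so the check can be scripted. In this layout the master is expected to run NameNode, SecondaryNameNode and ResourceManager, and each slave DataNode and NodeManager; that expectation and the `missing_daemons` helper name are assumptions of this sketch.

```shell
# missing_daemons ROLE: read `jps` output on stdin and print every daemon
# expected for ROLE (master or slave) that is not in the listing.
missing_daemons() {
    case $1 in
        master) expected="NameNode SecondaryNameNode ResourceManager" ;;
        slave)  expected="DataNode NodeManager" ;;
        *) echo "usage: missing_daemons master|slave" >&2; return 2 ;;
    esac
    listing=$(cat)
    for daemon in $expected; do
        # jps prints "<pid> <class name>"; compare the second field exactly
        # so that SecondaryNameNode is not mistaken for NameNode.
        printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' ||
            echo "$daemon"
    done
}

# Only run the live check where a JDK with jps is installed.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons master
fi
```

Empty output means nothing is missing. On a slave, run jps | missing_daemons slave instead; a missing daemon usually has the reason in its log file under $HADOOP_HOME/logs.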

Original article: https://www.cnblogs.com/zhang-ke/p/5639254.html