Hadoop Installation

1. Install Java:

[hadoop@typhoeus79 hadoop]$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

[hadoop@typhoeus79 hadoop]$ echo $JAVA_HOME
/usr/java/jdk1.6.0_30
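Before continuing, it can be worth failing fast if the JDK is missing or JAVA_HOME is mis-set. A minimal sketch, assuming the JAVA_HOME path from this host (adjust to your install):

```shell
# Sanity-check the JDK before continuing.
# /usr/java/jdk1.6.0_30 is the JAVA_HOME from this setup; adjust to your layout.
JAVA_HOME="${JAVA_HOME:-/usr/java/jdk1.6.0_30}"
if [ -x "$JAVA_HOME/bin/java" ]; then
    "$JAVA_HOME/bin/java" -version 2>&1 | head -n 1
else
    echo "no JDK found at $JAVA_HOME" >&2
fi
```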

2. Configure SSH

Set up a passwordless (key-based) trust relationship between the master and the slave nodes.
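A minimal sketch of what this trust relationship means in practice, assuming the hadoop user on the master and the slave hostnames from /etc/hosts below (adjust to your cluster):

```shell
# On the master, as the hadoop user: generate a key pair (no passphrase)
# and push the public key to each slave so start-all.sh can log in without a password.
mkdir -p ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
for host in typhoeus80 typhoeus81 typhoeus82; do
    # or manually append ~/.ssh/id_rsa.pub to the slave's ~/.ssh/authorized_keys
    ssh-copy-id "hadoop@$host" || echo "could not copy key to $host" >&2
done
ssh typhoeus80 hostname || true   # should succeed with no password prompt
```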

3. Set /etc/hosts on each host in the cluster

# hadoop
10.XXX.XXX.XX    typhoeus79
10.XXX.XXX.XX    typhoeus80
10.XXX.XXX.XX    typhoeus81
10.XXX.XXX.XX    typhoeus82
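After editing, each cluster name should resolve locally. A quick check, using the hostnames from the table above:

```shell
# Verify that every cluster hostname resolves (via /etc/hosts or DNS).
for host in typhoeus79 typhoeus80 typhoeus81 typhoeus82; do
    if getent hosts "$host" >/dev/null; then
        echo "$host resolves"
    else
        echo "$host MISSING from /etc/hosts"
    fi
done
```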

4. Set up Hadoop

http://hadoop.apache.org/releases.html#Download

4.1. Download, unpack, and change ownership

tar -zxvf hadoop-1.2.1.tar.gz

mv hadoop-1.2.1 hadoop

chown -R hadoop:hadoop ./hadoop

4.2. Edit core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data1/guosong/hadoop/tmp</value>
    </property>

    <property>
        <name>fs.default.name</name>
        <value>hdfs://10.xxx.xxx.xxx:6001</value>
    </property>

</configuration>

4.3. Edit hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <property>
        <name>dfs.name.dir</name>
        <value>/data1/guosong/hadoop/name</value>
        <final>true</final>
    </property>

    <property>
        <name>dfs.data.dir</name>
        <value>/data1/guosong/hadoop/data</value>
        <final>true</final>
    </property>

</configuration>
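The local directories named in core-site.xml and hdfs-site.xml are worth creating up front and handing to the hadoop user; this avoids permission errors at first start. A sketch using the base path from the configs above (adjust to your layout):

```shell
# Pre-create the local directories referenced by the configs.
BASE="/data1/guosong/hadoop"   # base path used in the XML above
mkdir -p "$BASE/tmp" "$BASE/name" "$BASE/data" || echo "create $BASE manually" >&2
# Needs root; skip if the hadoop user already owns the tree.
chown -R hadoop:hadoop "$BASE" 2>/dev/null || true
```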

4.4. Edit mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>10.xxx.xxx.xxx:6002</value>
    </property>
</configuration>
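One thing the config files above leave implicit: in Hadoop 1.x, bin/start-all.sh typically decides where to launch daemons from conf/masters (secondary namenode host) and conf/slaves (datanode/tasktracker hosts). A sketch with the hostnames from /etc/hosts above, assuming the hadoop directory from step 4.1:

```shell
# Write the host lists that start-all.sh / stop-all.sh read.
CONF_DIR="hadoop/conf"   # path from step 4.1; adjust if your layout differs
mkdir -p "$CONF_DIR"
printf '%s\n' typhoeus79 > "$CONF_DIR/masters"                        # secondary namenode
printf '%s\n' typhoeus80 typhoeus81 typhoeus82 > "$CONF_DIR/slaves"   # datanodes/tasktrackers
```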

5. Start Hadoop

./bin/hadoop namenode -format   (required)
./bin/start-all.sh
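Before running the dfsadmin report in the next step, a quick way to see whether the daemons actually came up is jps (which ships with the JDK) on each node. A sketch for the master; on each slave, expect DataNode and TaskTracker instead:

```shell
# Check the master daemons after start-all.sh.
for daemon in NameNode SecondaryNameNode JobTracker; do
    if jps 2>/dev/null | grep -q "$daemon"; then
        echo "$daemon is running"
    else
        echo "$daemon is NOT running"
    fi
done
```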

6. Verify (check cluster status)

[hadoop@typhoeus79 hadoop]$ ./bin/hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Configured Capacity: 1155636670464 (1.05 TB)
Present Capacity: 745935544320 (694.71 GB)
DFS Remaining: 745935421440 (694.71 GB)
DFS Used: 122880 (120 KB)
DFS Used%: 0%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Name: 10.xxx.xxx.xxx:50010
Decommission Status : Normal
Configured Capacity: 383154757632 (356.84 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 100126064640 (93.25 GB)
DFS Remaining: 283028652032(263.59 GB)
DFS Used%: 0%
DFS Remaining%: 73.87%
Last contact: Fri Dec 20 19:38:29 CST 2013


Name: 10.xxx.xxx.xxx:50010
Decommission Status : Normal
Configured Capacity: 389327155200 (362.59 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 221013901312 (205.84 GB)
DFS Remaining: 168313212928(156.75 GB)
DFS Used%: 0%
DFS Remaining%: 43.23%
Last contact: Fri Dec 20 19:38:29 CST 2013


Name: 10.xxx.xxx.xxx:50010
Decommission Status : Normal
Configured Capacity: 383154757632 (356.84 GB)
DFS Used: 40960 (40 KB)
Non DFS Used: 88561160192 (82.48 GB)
DFS Remaining: 294593556480(274.36 GB)
DFS Used%: 0%
DFS Remaining%: 76.89%
Last contact: Fri Dec 20 19:38:28 CST 2013

Main problems encountered:

1. SSH configuration: the company disables this internally, so getting it changed was fairly troublesome.

2. A port is already in use: check the logs (and verify with netstat -tln | grep).

3. The namenode was not formatted before starting; fix with:

./bin/hadoop namenode -format
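For problem 2 above, a sketch of the check, using the NameNode/JobTracker ports configured earlier (6001/6002):

```shell
# See whether the ports Hadoop wants are already bound by another process.
for port in 6001 6002; do
    if netstat -tln 2>/dev/null | grep -q ":$port\b"; then
        echo "port $port is already in use"
    else
        echo "port $port is free"
    fi
done
```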
Original article: https://www.cnblogs.com/gsblog/p/3484412.html