Hadoop in Action, Chapter 2, Continued (Part 2): Fully Distributed Deployment

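So far we have focused on distributed storage and distributed computation; now it is time to set up a complete cluster. In this section we use the following server names: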

1. master: the master node, which mainly runs the NameNode and JobTracker services.

2. backup: runs the Secondary NameNode service.

3. hadoop1, hadoop2, hadoop3, … (slave nodes): run the DataNode and TaskTracker services.

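To configure fully distributed mode, start from the skeleton configuration of the earlier pseudo-distributed setup and modify it.

Copy all of the configuration files below to every slave server, and make sure HDFS has been formatted. Formatting is performed once on the NameNode (bin/hadoop namenode -format on the master), not separately on each slave.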
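The copy step can be sketched roughly as follows. This is only an illustration: the three slave names come from the list above, while the conf directory /opt/hadoop/conf, the helper names build_copy_commands and copy_configs, and the presence of scp with passwordless SSH to every slave are assumptions, not something the book prescribes.

```python
import subprocess

# The three files shown in this section.
CONFIG_FILES = ["hdfs-site.xml", "core-site.xml", "mapred-site.xml"]


def build_copy_commands(hosts, conf_dir, files=CONFIG_FILES):
    """Return one scp argument list per host that copies every config file."""
    conf_dir = conf_dir.rstrip("/")
    sources = [f"{conf_dir}/{name}" for name in files]
    # scp accepts several source files followed by a single destination.
    return [["scp", *sources, f"{host}:{conf_dir}/"] for host in hosts]


def copy_configs(hosts, conf_dir, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_copy_commands(hosts, conf_dir):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical install path; adjust to the real Hadoop conf directory.
    copy_configs(["hadoop1", "hadoop2", "hadoop3"], "/opt/hadoop/conf")
```

By default the sketch only prints the commands, so the list can be checked before anything is actually copied.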

hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>The actual number of replications can be specified when the
    file is created.</description>
  </property>
</configuration>

core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
    <description>The name of the default file system. A URI whose
    scheme and authority determine the FileSystem implementation.
    </description>
  </property>
</configuration>

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
    <description>The host and port that the MapReduce job tracker runs
    at.</description>
  </property>
</configuration>

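The key point of this configuration is that the master server's hostname must be correct: it appears in both fs.default.name and mapred.job.tracker, and every node has to be able to resolve it.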
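One way to catch a mistyped master name before starting the cluster is to read the host back out of the two files and compare. The snippet below is a minimal sketch using only the Python standard library; the function names read_properties and master_hosts are made up for illustration, and it assumes the files follow the simple configuration/property/name/value layout shown above.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def master_hosts(core_site_xml, mapred_site_xml):
    """Return (NameNode host, JobTracker host) as named in the two files."""
    core = read_properties(core_site_xml)
    mapred = read_properties(mapred_site_xml)
    # fs.default.name is a URI such as hdfs://master:9000.
    namenode_host = urlparse(core["fs.default.name"]).hostname
    # mapred.job.tracker is a plain host:port pair, not a URI.
    jobtracker_host = mapred["mapred.job.tracker"].rsplit(":", 1)[0]
    return namenode_host, jobtracker_host


if __name__ == "__main__":
    core = ("<configuration><property><name>fs.default.name</name>"
            "<value>hdfs://master:9000</value></property></configuration>")
    mapred = ("<configuration><property><name>mapred.job.tracker</name>"
              "<value>master:9001</value></property></configuration>")
    print(master_hosts(core, mapred))  # prints ('master', 'master')
```

In the layout used in this section both values should name the same machine, master. Checking that each node can actually resolve that name (for example with socket.gethostbyname) would be a natural extra step, but it depends on the network setup and is left out here.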