Compiling the Hadoop 2.x Source Code on CentOS

1. Preface

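The Hadoop-2.4.0 source tree contains a BUILDING.txt file that explains how to compile the source on Linux and Windows. This article largely follows BUILDING.txt and distills the key steps.
The first build requires Internet access. Hadoop's build depends on a very large number of artifacts, so make sure the machine can reach the Internet; otherwise it is hard to resolve every build problem one by one. Builds after the first do not need to download them again.
If the machine cannot get online, see the guide on how to get network access under the three virtual machine network modes.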
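2. Installing dependencies
Before compiling the Hadoop 2.4.0 source, install the following dependencies:
1) JDK 1.6 or later (this article uses JDK 1.7; do not install JDK 1.8, which is incompatible with Hadoop 2.4.0 and produces many errors when compiling the Hadoop 2.4.0 source)
2) Maven 3.0 or later
3) ProtocolBuffer 2.5.0
4) CMake 2.6 or later
5) Findbugs 1.3.9, optional (not installed for the build in this article)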
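
The version requirements above can be checked before starting a long build. The following is a minimal sketch, not something taken from BUILDING.txt: the helper names version_ge and jdk_supported are hypothetical, and version_ge assumes a sort that supports the -V (version sort) option, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helpers for checking the dependency versions listed above.

# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# jdk_supported VERSION: succeed for 1.6.x and 1.7.x and fail for anything
# else, following this article's advice to avoid JDK 1.8 with Hadoop 2.4.0.
jdk_supported() {
    case "$1" in
        1.6*|1.7*) return 0 ;;
        *)         return 1 ;;
    esac
}
```

For example, version_ge 2.8.12.2 2.6 && echo "CMake OK" accepts the CMake release used later in this article, while jdk_supported 1.8.0_20 fails.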

Once they are installed, set the environment variables by editing either /etc/profile or ~/.profile and adding the following:
export JAVA_HOME=/root/jdk
export CLASSPATH=$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH

export CMAKE_HOME=/root/cmake
export PATH=$CMAKE_HOME/bin:$PATH

export PROTOC_HOME=/root/protobuf
export PATH=$PROTOC_HOME/bin:$PATH

export MAVEN_HOME=/root/maven
export PATH=$MAVEN_HOME/bin:$PATH

This article installs everything as root under /root, but a non-root user and a directory other than /root work just as well.
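
Each export above prepends a directory to PATH, so sourcing the profile several times in one session makes PATH grow with duplicates. One way to avoid that is sketched below; prepend_path is a hypothetical helper, not part of any tool mentioned in this article.

```shell
#!/bin/sh
# prepend_path DIR PATHLIST: print PATHLIST with DIR in front, unless DIR
# is already one of its colon-separated entries. Hypothetical helper.
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;      # already present: unchanged
        "::")     printf '%s\n' "$1" ;;      # empty list: DIR alone
        *)        printf '%s\n' "$1:$2" ;;   # otherwise: DIR goes first
    esac
}
```

In a profile it could be used as PATH=$(prepend_path "$JAVA_HOME/bin" "$PATH"); export PATH, and likewise for the CMake, protobuf and Maven directories.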

2.1. Installing ProtocolBuffer
The standard automake build-and-install procedure:
1) cd /root
2) tar xzf protobuf-2.5.0.tar.gz
3) cd protobuf-2.5.0
4) ./configure --prefix=/root/protobuf
5) make
6) make install
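
After make install it is worth confirming that the protoc found on PATH is the 2.5.0 release named in the dependency list. protoc --version prints a line of the form "libprotoc 2.5.0"; the sketch below parses that line. The function names are hypothetical.

```shell
#!/bin/sh
# protoc_version_of OUTPUT: print the version number from a line such as
# "libprotoc 2.5.0", the format printed by `protoc --version`.
protoc_version_of() {
    printf '%s\n' "$1" | awk '$1 == "libprotoc" { print $2; exit }'
}

# protoc_is_2_5_0 OUTPUT: succeed only when the reported version is 2.5.0.
protoc_is_2_5_0() {
    [ "$(protoc_version_of "$1")" = "2.5.0" ]
}
```

A typical call would be protoc_is_2_5_0 "$(protoc --version)" || echo "unexpected protoc version" >&2, run after the PATH settings from section 2 are in effect.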

2.2. Installing CMake
1) cd /root
2) tar xzf cmake-2.8.12.2.tar.gz
3) cd cmake-2.8.12.2
4) ./bootstrap --prefix=/root/cmake
5) make
6) make install

2.3. Installing the JDK
1) cd /root
2) tar xzf jdk-7u55-linux-x64.gz
3) ln -s jdk1.7.0_55 jdk

2.4. Installing Maven
1) cd /root
2) tar xzf apache-maven-3.0.5-bin.tar.gz
3) ln -s apache-maven-3.0.5 maven


3. Compiling the Hadoop source code
With the preparation above complete, start the Hadoop source build by running: mvn package -Pdist -DskipTests -Dtar. Be sure not to use JDK 1.8.

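After a successful build, the jar files are placed in the target subdirectories; run find from the Hadoop source directory to locate the individual target subdirectories.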
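
The find search mentioned above could look like the following sketch. list_built_jars is a hypothetical helper name; it simply lists every .jar located somewhere below a directory named target.

```shell
#!/bin/sh
# list_built_jars SRC_ROOT: print, sorted by path, every .jar file that
# lies underneath a directory named "target" below SRC_ROOT.
list_built_jars() {
    find "$1" -type f -name '*.jar' -path '*/target/*' | sort
}
```

For the layout used in this article the call would be list_built_jars /root/hadoop-2.4.0-src.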
A successful build also produces the Hadoop binary package hadoop-2.4.0.tar.gz in the hadoop-dist/target subdirectory of the source tree:
main:
     [exec] $ tar cf hadoop-2.4.0.tar hadoop-2.4.0
     [exec] $ gzip -f hadoop-2.4.0.tar
     [exec] 
     [exec] Hadoop dist tar available at: /root/hadoop-2.4.0-src/hadoop-dist/target/hadoop-2.4.0.tar.gz
     [exec] 
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-dist ---
[INFO] Building jar: /root/hadoop-2.4.0-src/hadoop-dist/target/hadoop-dist-2.4.0-javadoc.jar
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................ SUCCESS [4.647s]
[INFO] Apache Hadoop Project POM ......................... SUCCESS [5.352s]
[INFO] Apache Hadoop Annotations ......................... SUCCESS [7.239s]
[INFO] Apache Hadoop Assemblies .......................... SUCCESS [0.424s]
[INFO] Apache Hadoop Project Dist POM .................... SUCCESS [2.918s]
[INFO] Apache Hadoop Maven Plugins ....................... SUCCESS [6.261s]
[INFO] Apache Hadoop MiniKDC ............................. SUCCESS [5.321s]
[INFO] Apache Hadoop Auth ................................ SUCCESS [5.953s]
[INFO] Apache Hadoop Auth Examples ....................... SUCCESS [3.783s]
[INFO] Apache Hadoop Common .............................. SUCCESS [1:54.010s]
[INFO] Apache Hadoop NFS ................................. SUCCESS [9.721s]
[INFO] Apache Hadoop Common Project ...................... SUCCESS [0.048s]
[INFO] Apache Hadoop HDFS ................................ SUCCESS [4:15.270s]
[INFO] Apache Hadoop HttpFS .............................. SUCCESS [6:18.553s]
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SUCCESS [16.237s]
[INFO] Apache Hadoop HDFS-NFS ............................ SUCCESS [6.543s]
[INFO] Apache Hadoop HDFS Project ........................ SUCCESS [0.036s]
[INFO] hadoop-yarn ....................................... SUCCESS [0.051s]
[INFO] hadoop-yarn-api ................................... SUCCESS [1:35.227s]
[INFO] hadoop-yarn-common ................................ SUCCESS [43.216s]
[INFO] hadoop-yarn-server ................................ SUCCESS [0.055s]
[INFO] hadoop-yarn-server-common ......................... SUCCESS [16.476s]
[INFO] hadoop-yarn-server-nodemanager .................... SUCCESS [19.942s]
[INFO] hadoop-yarn-server-web-proxy ...................... SUCCESS [4.926s]
[INFO] hadoop-yarn-server-applicationhistoryservice ...... SUCCESS [9.804s]
[INFO] hadoop-yarn-server-resourcemanager ................ SUCCESS [23.320s]
[INFO] hadoop-yarn-server-tests .......................... SUCCESS [1.208s]
[INFO] hadoop-yarn-client ................................ SUCCESS [9.177s]
[INFO] hadoop-yarn-applications .......................... SUCCESS [0.113s]
[INFO] hadoop-yarn-applications-distributedshell ......... SUCCESS [4.106s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SUCCESS [3.265s]
[INFO] hadoop-yarn-site .................................. SUCCESS [0.056s]
[INFO] hadoop-yarn-project ............................... SUCCESS [5.552s]
[INFO] hadoop-mapreduce-client ........................... SUCCESS [0.096s]
[INFO] hadoop-mapreduce-client-core ...................... SUCCESS [37.231s]
[INFO] hadoop-mapreduce-client-common .................... SUCCESS [27.135s]
[INFO] hadoop-mapreduce-client-shuffle ................... SUCCESS [4.886s]
[INFO] hadoop-mapreduce-client-app ....................... SUCCESS [17.876s]
[INFO] hadoop-mapreduce-client-hs ........................ SUCCESS [14.140s]
[INFO] hadoop-mapreduce-client-jobclient ................. SUCCESS [11.305s]
[INFO] hadoop-mapreduce-client-hs-plugins ................ SUCCESS [3.083s]
[INFO] Apache Hadoop MapReduce Examples .................. SUCCESS [9.855s]
[INFO] hadoop-mapreduce .................................. SUCCESS [5.110s]
[INFO] Apache Hadoop MapReduce Streaming ................. SUCCESS [7.778s]
[INFO] Apache Hadoop Distributed Copy .................... SUCCESS [12.973s]
[INFO] Apache Hadoop Archives ............................ SUCCESS [3.265s]
[INFO] Apache Hadoop Rumen ............................... SUCCESS [11.060s]
[INFO] Apache Hadoop Gridmix ............................. SUCCESS [7.412s]
[INFO] Apache Hadoop Data Join ........................... SUCCESS [4.221s]
[INFO] Apache Hadoop Extras .............................. SUCCESS [4.771s]
[INFO] Apache Hadoop Pipes ............................... SUCCESS [0.032s]
[INFO] Apache Hadoop OpenStack support ................... SUCCESS [8.030s]
[INFO] Apache Hadoop Client .............................. SUCCESS [7.730s]
[INFO] Apache Hadoop Mini-Cluster ........................ SUCCESS [0.158s]
[INFO] Apache Hadoop Scheduler Load Simulator ............ SUCCESS [7.485s]
[INFO] Apache Hadoop Tools Dist .......................... SUCCESS [6.912s]
[INFO] Apache Hadoop Tools ............................... SUCCESS [0.029s]
[INFO] Apache Hadoop Distribution ........................ SUCCESS [40.425s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 21:57.892s
[INFO] Finished at: Mon Apr 21 14:33:22 CST 2014
[INFO] Final Memory: 88M/243M
[INFO] ------------------------------------------------------------------------




4. Importing the Hadoop project into Eclipse

Importing projects to eclipse

When you import the project to eclipse, install hadoop-maven-plugins at first.

  $ cd hadoop-maven-plugins

  $ mvn install

Then, generate eclipse project files.

  $ mvn eclipse:eclipse -DskipTests

At last, import to eclipse by specifying the root directory of the project via

[File] > [Import] > [Existing Projects into Workspace].

 

5. Configuring a Maven mirror inside China
        1) Go to the installation directory /opt/modules/apache-maven-1.0.5/conf and edit the settings.xml file

* Edit the <mirrors> section:

       <mirror>
              <id>nexus-osc</id>
              <mirrorOf>*</mirrorOf>
              <name>Nexus osc</name>
              <url>http://maven.oschina.net/content/groups/public/</url>
       </mirror>

* Edit the <profiles> section:

<profile>
       <id>jdk-1.6</id>
       <activation>
              <jdk>1.6</jdk>
       </activation>
       <repositories>
              <repository>
                     <id>nexus</id>
                     <name>local private nexus</name>
                     <url>http://maven.oschina.net/content/groups/public/</url>
                     <releases>
                            <enabled>true</enabled>
                     </releases>
                     <snapshots>
                            <enabled>false</enabled>
                     </snapshots>
              </repository>
       </repositories>
       <pluginRepositories>
              <pluginRepository>
                     <id>nexus</id>
                     <name>local private nexus</name>
                     <url>http://maven.oschina.net/content/groups/public/</url>
                     <releases>
                            <enabled>true</enabled>
                     </releases>
                     <snapshots>
                            <enabled>false</enabled>
                     </snapshots>
              </pluginRepository>
       </pluginRepositories>
</profile>

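        2) Copy the configuration

        Copy the configuration file to the user's home directory so that Maven picks it up on every run.

* Check whether the home directory [/home/hadoop] contains a [.m2] folder; if it does not, create it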

$ cd /home/hadoop

$ mkdir .m2

* Copy the file

$ cp /opt/modules/apache-maven-1.0.5/conf/settings.xml ~/.m2/
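
The mkdir and cp steps above can be wrapped so that they are safe to repeat. This is only a sketch: install_maven_settings is a hypothetical helper, and keeping a .bak copy of an existing settings.xml is an addition that the original steps do not include.

```shell
#!/bin/sh
# install_maven_settings SRC HOME_DIR: copy SRC to HOME_DIR/.m2/settings.xml,
# creating .m2 when it is missing and keeping a .bak copy of any settings.xml
# that is already there (the backup is an assumption, not in the steps above).
install_maven_settings() {
    src=$1
    m2_dir=$2/.m2
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$m2_dir" || return 1
    if [ -f "$m2_dir/settings.xml" ]; then
        cp "$m2_dir/settings.xml" "$m2_dir/settings.xml.bak" || return 1
    fi
    cp "$src" "$m2_dir/settings.xml"
}
```

With the paths from this section the call would be install_maven_settings /opt/modules/apache-maven-1.0.5/conf/settings.xml /home/hadoop.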

6. Configuring DNS

Edit: vi /etc/resolv.conf     

nameserver 8.8.8.8

nameserver 8.8.4.4
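
Instead of editing the file by hand, the two entries can be appended only when they are missing. The sketch below assumes a hypothetical helper named add_nameserver; it takes the file path as an argument so that it can be tried on a scratch copy before touching /etc/resolv.conf.

```shell
#!/bin/sh
# add_nameserver FILE IP: append "nameserver IP" to FILE unless an entry
# for exactly that IP is already present. Hypothetical helper.
add_nameserver() {
    if ! awk -v ip="$2" '$1 == "nameserver" && $2 == ip { found = 1 }
                         END { exit !found }' "$1" 2>/dev/null; then
        printf 'nameserver %s\n' "$2" >> "$1"
    fi
}
```

Run as root, add_nameserver /etc/resolv.conf 8.8.8.8 followed by add_nameserver /etc/resolv.conf 8.8.4.4 gives the two lines shown above, and repeating the calls leaves the file unchanged.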

 

 FAQ:

1) Error: "[ERROR] Failed to execute goal org.codehaus.mojo:make-maven-plugin:1.0-beta-1:autoreconf (autoreconf) on project hadoop-yarn-server-nodemanager: autoreconf command returned an exit value != 0. Aborting build; see debug output for more information. -> [Help 1]"
[ERROR] Failed to execute goal org.codehaus.mojo:make-maven-plugin:1.0-beta-1:autoreconf (compile) on project hadoop-common: autoreconf command returned an exit value != 0. Aborting build; see debug output for more information. -> [Help 1]
This happens because the build was run with the native option but the autotools are not installed. On CentOS:
yum install autoconf
yum install automake
yum install libtool        <--- autoreconf itself ships with autoconf, but it calls libtoolize from this package
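
Before rerunning the build, a quick check shows which of these commands are still missing from PATH. This is a small sketch; missing_tools is a hypothetical helper built on the standard command -v lookup.

```shell
#!/bin/sh
# missing_tools NAME...: print each NAME that cannot be found on PATH,
# one per line. Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For the native build, missing_tools autoconf automake autoreconf libtoolize should print nothing once the three packages are installed.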

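If it still fails, build with -P-cbuild and give up on native.
2) Build fails with "[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec (generate-sources) on project hadoop-yarn-api: Command execution failed. Process exited with an error: 1(Exit value: 1) -> [Help 1]"
protoc is not installed. See the previous article, or download it from Google. Note that the link below points to protobuf 2.4.1, whereas Hadoop 2.4.0 needs 2.5.0 as listed in section 2.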
http://protobuf.googlecode.com/files/protobuf-2.4.1.tar.gz

[INFO] configure: error: Zlib headers were not found... native-hadoop library needs zlib to build. Please install the requisite zlib development package.
Cause: the zlib development files are not installed
Fix (Debian/Ubuntu): sudo apt-get install --reinstall zlibc zlib1g zlib1g-dev
3) Failed to execute goal org.codehaus.mojo:make-maven-plugin:1.0-beta-1:configure (compile) on project hadoop-common: ./configure returned an exit value != 0. Aborting build; see command output above for more information. -> [Help 1]

zlib is not installed.
yum install zlib
yum install zlib-devel
[ERROR] Failed to execute goal org.codehaus.mojo:make-maven-plugin:1.0-beta-1:configure (compile) on project hadoop-common: ./configure returned an exit value != 0. Aborting build; see command output above for more information. -> [Help 1]
Cause: configure: error: Native java headers not found. Is $JAVA_HOME set correctly?
Fix: Ubuntu comes with OpenJDK installed, but the Sun JDK is needed. Copy it over from the virtual machine, then point JAVA_HOME at it.
  Installing the JDK: download it from http://www.oracle.com/technetwork/java/javase/downloads/index.html
                  chmod +x jdk-6u43-linux-i586.bin
                  ./jdk-6u43-linux-i586.bin
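
The "Native java headers not found" message can be checked for directly, because a JDK keeps its native headers in the include directory under its home. The sketch below uses a hypothetical helper name, has_jni_headers, and tests only for jni.h.

```shell
#!/bin/sh
# has_jni_headers JAVA_HOME_DIR: succeed when JAVA_HOME_DIR is non-empty
# and JAVA_HOME_DIR/include/jni.h exists. Hypothetical helper.
has_jni_headers() {
    [ -n "$1" ] && [ -f "$1/include/jni.h" ]
}
```

Running has_jni_headers "$JAVA_HOME" || echo "JAVA_HOME does not point at a JDK with native headers" >&2 before the build catches an unset or wrong JAVA_HOME early.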

4) [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (site) on project hadoop-common: An Ant BuildException has occured: Execute failed: java.io.IOException: Cannot run program "${env.FORREST_HOME}/bin/forrest" (in directory "/root/hadoop/release-0.23.0-rc1/hadoop-common-project/hadoop-common/target/docs-src"): java.io.IOException: error=2, No such file or directory -> [Help 1]
Forrest is not installed.
Apache Forrest:
http://forrest.apache.org/mirrors.cgi

Install it and set FORREST_HOME in the profile.
.......

5) [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (site) on project hadoop-common: An Ant BuildException has occured: stylesheet /root/hadoop/release-0.23.0-rc1/hadoop-common-project/hadoop-common/${env.FINDBUGS_HOME}/src/xsl/default.xsl doesn't exist. -> [Help 1]
FindBugs is not installed.
http://findbugs.sourceforge.net/downloads.html

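6) On CentOS, an earlier build failed when ant was not installed and the command was run as a normal user without sudo, as shown below. (Beforehand, remember to write the Maven and JAVA_HOME environment variables into .bash_profile or /etc/profile.)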
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop Main ................................ SUCCESS [07:39 min]
[INFO] Apache Hadoop Project POM ......................... SUCCESS [03:07 min]
[INFO] Apache Hadoop Annotations ......................... SUCCESS [01:53 min]
[INFO] Apache Hadoop Assemblies .......................... SUCCESS [  0.457 s]
[INFO] Apache Hadoop Project Dist POM .................... SUCCESS [08:52 min]
[INFO] Apache Hadoop Maven Plugins ....................... SUCCESS [ 41.301 s]
[INFO] Apache Hadoop MiniKDC ............................. SUCCESS [04:04 min]
[INFO] Apache Hadoop Auth ................................ SUCCESS [02:41 min]
[INFO] Apache Hadoop Auth Examples ....................... SUCCESS [ 13.874 s]
[INFO] Apache Hadoop Common .............................. FAILURE [07:28 min]
[INFO] Apache Hadoop NFS ................................. SKIPPED
[INFO] Apache Hadoop Common Project ...................... SKIPPED
[INFO] Apache Hadoop HDFS ................................ SKIPPED
[INFO] Apache Hadoop HttpFS .............................. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal ............. SKIPPED
[INFO] Apache Hadoop HDFS-NFS ............................ SKIPPED
[INFO] Apache Hadoop HDFS Project ........................ SKIPPED
[INFO] hadoop-yarn ....................................... SKIPPED
[INFO] hadoop-yarn-api ................................... SKIPPED
[INFO] hadoop-yarn-common ................................ SKIPPED
[INFO] hadoop-yarn-server ................................ SKIPPED
[INFO] hadoop-yarn-server-common ......................... SKIPPED
[INFO] hadoop-yarn-server-nodemanager .................... SKIPPED
[INFO] hadoop-yarn-server-web-proxy ...................... SKIPPED
[INFO] hadoop-yarn-server-applicationhistoryservice ...... SKIPPED
[INFO] hadoop-yarn-server-resourcemanager ................ SKIPPED
[INFO] hadoop-yarn-server-tests .......................... SKIPPED
[INFO] hadoop-yarn-client ................................ SKIPPED
[INFO] hadoop-yarn-applications .......................... SKIPPED
[INFO] hadoop-yarn-applications-distributedshell ......... SKIPPED
[INFO] hadoop-yarn-applications-unmanaged-am-launcher .... SKIPPED
[INFO] hadoop-yarn-site .................................. SKIPPED
[INFO] hadoop-yarn-project ............................... SKIPPED
[INFO] hadoop-mapreduce-client ........................... SKIPPED
[INFO] hadoop-mapreduce-client-core ...................... SKIPPED
[INFO] hadoop-mapreduce-client-common .................... SKIPPED
[INFO] hadoop-mapreduce-client-shuffle ................... SKIPPED
[INFO] hadoop-mapreduce-client-app ....................... SKIPPED
[INFO] hadoop-mapreduce-client-hs ........................ SKIPPED
[INFO] hadoop-mapreduce-client-jobclient ................. SKIPPED
[INFO] hadoop-mapreduce-client-hs-plugins ................ SKIPPED
[INFO] Apache Hadoop MapReduce Examples .................. SKIPPED
[INFO] hadoop-mapreduce .................................. SKIPPED
[INFO] Apache Hadoop MapReduce Streaming ................. SKIPPED
[INFO] Apache Hadoop Distributed Copy .................... SKIPPED
[INFO] Apache Hadoop Archives ............................ SKIPPED
[INFO] Apache Hadoop Rumen ............................... SKIPPED
[INFO] Apache Hadoop Gridmix ............................. SKIPPED
[INFO] Apache Hadoop Data Join ........................... SKIPPED
[INFO] Apache Hadoop Extras .............................. SKIPPED
[INFO] Apache Hadoop Pipes ............................... SKIPPED
[INFO] Apache Hadoop OpenStack support ................... SKIPPED
[INFO] Apache Hadoop Client .............................. SKIPPED
[INFO] Apache Hadoop Mini-Cluster ........................ SKIPPED
[INFO] Apache Hadoop Scheduler Load Simulator ............ SKIPPED
[INFO] Apache Hadoop Tools Dist .......................... SKIPPED
[INFO] Apache Hadoop Tools ............................... SKIPPED
[INFO] Apache Hadoop Distribution ........................ SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 40:45 min
[INFO] Finished at: 2014-10-20T00:30:04-08:00
[INFO] Final Memory: 82M/239M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (site) on project hadoop-common: An Ant BuildException has occured: stylesheet /home/grid/hadoop-2.4.1/release-2.4.1/hadoop-common-project/hadoop-common/${env.FINDBUGS_HOME}/src/xsl/default.xsl doesn't exist.
[ERROR] around Ant part ...<xslt style="${env.FINDBUGS_HOME}/src/xsl/default.xsl" in="/home/grid/hadoop-2.4.1/release-2.4.1/hadoop-common-project/hadoop-common/target/findbugsXml.xml" out="/home/grid/hadoop-2.4.1/release-2.4.1/hadoop-common-project/hadoop-common/target/site/findbugs.html"/>... @ 44:267 in /home/grid/hadoop-2.4.1/release-2.4.1/hadoop-common-project/hadoop-common/target/antrun/build-main.xml
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluen ... oExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-common
Fix:
Run mvn package -Pdist,native -DskipTests -Dtar as root, then change the owner of the files back to the normal user (grid in my case).
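
The failure log above ends with Maven's own hint to resume with mvn <goals> -rf :hadoop-common, which skips the modules that already succeeded. The sketch below pulls the project name out of a saved log so it can be passed to -rf. failed_project is a hypothetical helper, and it assumes the build output was saved to a file.

```shell
#!/bin/sh
# failed_project: read a Maven log on stdin and print the project name from
# the first "[ERROR] Failed to execute goal ... on project NAME:" line.
# Prints nothing when no such line exists.
failed_project() {
    sed -n 's/^\[ERROR\] Failed to execute goal .* on project \([^: ]*\):.*/\1/p' | head -n 1
}
```

With the output saved in a file, here called build.log purely as an example, the build can be resumed with mvn package -Pdist,native -DskipTests -Dtar -rf ":$(failed_project < build.log)".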


The build finished after a good half hour. After formatting, the dfs and yarn processes started normally!
Original article: https://www.cnblogs.com/ilinuxer/p/5024513.html