Debugging DataX compatibility with Hadoop 2.X

The steps below use an hdfsreader-to-hdfswriter job as the example:

1. The DataX job configuration file has to point to the Hadoop configuration file to use. With DataX + Hadoop 1.X you can use hadoop1.X/conf/core-site.xml directly.

With DataX + Hadoop 2.X, however, you need to merge hadoop2.X/etc/hadoop/core-site.xml and hadoop2.X/etc/hadoop/hdfs-site.xml into a single file, which can be named hadoop-site.xml.
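The merge can be done by hand, but a small script makes it repeatable. The following is a minimal sketch, not part of DataX or Hadoop: the function name `merge_site_files` is made up for illustration, and it assumes both inputs are ordinary Hadoop site files with a `<configuration>` root and `<property>` children. When the same property name appears in more than one file, the later file wins.

```python
import xml.etree.ElementTree as ET


def merge_site_files(paths, out_path):
    """Merge the <property> entries of several Hadoop *-site.xml files.

    Hypothetical helper: later files override earlier ones when a
    property name repeats. Returns the merged property names in order.
    """
    merged = {}  # property name -> <property> element (insertion order kept)
    for path in paths:
        root = ET.parse(path).getroot()  # expected root: <configuration>
        for prop in root.findall("property"):
            name = prop.findtext("name")
            if name is not None:
                merged[name.strip()] = prop

    configuration = ET.Element("configuration")
    configuration.extend(merged.values())
    ET.ElementTree(configuration).write(
        out_path, encoding="utf-8", xml_declaration=True
    )
    return list(merged)
```

A call such as `merge_site_files([core_site, hdfs_site], "hadoop-site.xml")`, with `core_site` and `hdfs_site` set to the two Hadoop 2.X files, writes the combined file that the DataX job configuration can then reference.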

2. In the merged hadoop-site.xml file, add the following properties:

  <property>
    <name>fs.hdfs.impl</name>  <!-- needed when the hdfsreader/hdfswriter dir starts with hdfs://, i.e. an HDFS path -->
    <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
  </property>

  <property>
    <name>fs.file.impl</name>  <!-- needed when the hdfsreader/hdfswriter dir starts with file://, i.e. a local path -->
    <value>org.apache.hadoop.fs.LocalFileSystem</value>
  </property>
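To avoid forgetting one of these entries after regenerating the merged file, the two properties can be added programmatically. This is only a sketch under assumptions: `ensure_properties` is a hypothetical helper, and the file is assumed to have a `<configuration>` root. Note that `xml.etree.ElementTree` drops XML comments when it rewrites a file.

```python
import xml.etree.ElementTree as ET

# The two properties listed in step 2.
REQUIRED = {
    "fs.hdfs.impl": "org.apache.hadoop.hdfs.DistributedFileSystem",
    "fs.file.impl": "org.apache.hadoop.fs.LocalFileSystem",
}


def ensure_properties(site_path, required=REQUIRED):
    """Add or correct the given properties; return the names that changed."""
    tree = ET.parse(site_path)
    root = tree.getroot()
    existing = {
        p.findtext("name", "").strip(): p for p in root.findall("property")
    }
    changed = []
    for name, value in required.items():
        prop = existing.get(name)
        if prop is None:
            # Property is missing: append a new <property> element.
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
            ET.SubElement(prop, "value").text = value
            changed.append(name)
            continue
        value_el = prop.find("value")
        if value_el is None:
            value_el = ET.SubElement(prop, "value")
        if (value_el.text or "").strip() != value:
            # Property exists with a different value: overwrite it.
            value_el.text = value
            changed.append(name)
    if changed:
        tree.write(site_path, encoding="utf-8", xml_declaration=True)
    return changed
```

Running it a second time on the same file returns an empty list, which makes it safe to call from a deployment script.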

3. The hdfsreader plugin needs some additional dependency jars. After adding them, its directory contains:

  -rw-r--r-- 1 hadoop hadoop 575389 Dec 18 16:24 commons-collections-3.2.1.jar
  -rw-r--r-- 1 hadoop hadoop 62050 Dec 18 16:23 commons-logging-1.1.3.jar
  -rw-r--r-- 1 hadoop hadoop 1648200 Dec 18 16:25 guava-11.0.2.jar
  -rw-r--r-- 1 hadoop hadoop 3318401 Dec 18 16:26 hadoop-common-2.6.2.jar
  -rw-r--r-- 1 hadoop hadoop 178199 Dec 18 16:26 hadoop-lzo-0.4.20-SNAPSHOT.jar
  -rw-r--r-- 1 hadoop hadoop 16380 Dec 18 15:29 hdfsreader-1.0.0.jar
  -rw-r--r-- 1 hadoop hadoop 18490 Dec 18 15:29 java-xmlbuilder-0.4.jar
  -rw-r--r-- 1 hadoop hadoop 2019 Dec 18 15:29 ParamKey.java
  -rwxr-xr-x 1 hadoop hadoop 18837 Dec 18 15:29 plugins-common-1.0.0.jar

  The old hdfsreader/hadoop-0.19.2-core.jar (any hadoop*-core*.jar) must be deleted.
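A quick way to confirm that nothing was forgotten is to compare the plugin directory against the expected jar names. The sketch below is an illustration only: `missing_jars` is a hypothetical helper, and the list copies the third-party jar names from the hdfsreader listing above (the plugin's own jars are left out). The versions match the Hadoop 2.6.2 setup described here and would differ for other releases.

```python
from pathlib import Path

# Third-party jar names copied from the hdfsreader listing (Hadoop 2.6.2 setup).
HDFSREADER_JARS = [
    "commons-collections-3.2.1.jar",
    "commons-logging-1.1.3.jar",
    "guava-11.0.2.jar",
    "hadoop-common-2.6.2.jar",
    "hadoop-lzo-0.4.20-SNAPSHOT.jar",
    "java-xmlbuilder-0.4.jar",
]


def missing_jars(plugin_dir, required=HDFSREADER_JARS):
    """Return the required jar names that are absent from plugin_dir."""
    present = {p.name for p in Path(plugin_dir).glob("*.jar")}
    return [jar for jar in required if jar not in present]
```

An empty result means every listed jar is in place; the same function can be reused for hdfswriter by passing a different `required` list.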

4. The hdfswriter plugin also needs some additional dependency jars. After adding them, its directory contains:

  -rwxr-xr-x 1 hadoop hadoop 41123 Dec 18 16:40 commons-cli-1.2.jar
  -rw-r--r-- 1 hadoop hadoop 575389 Dec 18 16:34 commons-collections-3.2.1.jar
  -rw-r--r-- 1 hadoop hadoop 62050 Dec 18 16:34 commons-logging-1.1.3.jar
  -rw-r--r-- 1 hadoop hadoop 1648200 Dec 18 16:34 guava-11.0.2.jar
  -rwxr-xr-x 1 hadoop hadoop 67190 Dec 18 16:40 hadoop-auth-2.6.2.jar
  -rw-r--r-- 1 hadoop hadoop 3318401 Dec 18 16:34 hadoop-common-2.6.2.jar
  -rwxr-xr-x 1 hadoop hadoop 7915385 Dec 18 16:36 hadoop-hdfs-2.6.2.jar
  -rw-r--r-- 1 hadoop hadoop 178199 Dec 18 16:34 hadoop-lzo-0.4.20-SNAPSHOT.jar
  -rw-r--r-- 1 hadoop hadoop 14652 Dec 18 16:35 hdfswriter-1.0.0.jar
  -rwxr-xr-x 1 hadoop hadoop 31212 Dec 18 16:43 htrace-core-3.0.4.jar
  -rw-r--r-- 1 hadoop hadoop 18490 Dec 18 16:34 java-xmlbuilder-0.4.jar
  -rw-r--r-- 1 hadoop hadoop 657766 Dec 18 15:28 libhadoop.so
  -rw-r--r-- 1 hadoop hadoop 4374 Dec 18 15:28 ParamKey.java
  -rwxr-xr-x 1 hadoop hadoop 18837 Dec 18 16:34 plugins-common-1.0.0.jar
  -rwxr-xr-x 1 hadoop hadoop 533455 Dec 18 16:43 protobuf-java-2.5.0.jar

  The old hdfswriter/hadoop-0.19.2-core.jar (any hadoop*-core*.jar) must be deleted.
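Leftover Hadoop 0.19 core jars are easy to miss because they sit next to the new jars. The following sketch looks for files matching the `hadoop*-core*.jar` pattern mentioned above and only deletes them when asked to. The function name and the `dry_run` flag are assumptions made for this example, not DataX features.

```python
from pathlib import Path


def remove_legacy_core_jars(plugin_dirs, dry_run=True):
    """Find hadoop*-core*.jar files in the given plugin directories.

    With dry_run=True (the default) nothing is deleted; the matches are
    only returned so they can be reviewed first.
    """
    found = []
    for plugin_dir in plugin_dirs:
        found.extend(sorted(Path(plugin_dir).glob("hadoop*-core*.jar")))
    if not dry_run:
        for jar in found:
            jar.unlink()
    return found
```

Calling it first with the default `dry_run=True` on the hdfsreader and hdfswriter directories shows what would be removed; jars such as hadoop-common-2.6.2.jar or htrace-core-3.0.4.jar do not match the pattern and are left alone.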

5. Make sure the environment variables are set correctly, for example:

  PATH=$PATH:$HOME/app/bin.    # wrong: the "." is glued to the directory name; this mistake is hard to spot and easily causes problems

  PATH=$PATH:$HOME/app/bin:.:  # correct: the current directory "." must be its own entry, separated by ":"
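Because this kind of typo is hard to see by eye, a small check can flag it. This is a minimal sketch assuming a Unix-style PATH separated by ":"; `check_path` is a hypothetical helper written for this example.

```python
import os


def check_path(path_value):
    """Return (has_current_dir, suspicious_entries) for a ':'-separated PATH."""
    entries = path_value.split(":")
    # An empty entry or a lone "." both mean "current directory" to the shell.
    has_current_dir = any(entry in ("", ".") for entry in entries)
    # A directory name ending in "." usually means the ":" before "." was dropped.
    suspicious = [
        entry
        for entry in entries
        if entry.endswith(".") and os.path.basename(entry) not in (".", "..")
    ]
    return has_current_dir, suspicious
```

Passing `os.environ["PATH"]` to `check_path` reports whether the current directory is included as its own entry and lists any entries that look like the wrong form shown above.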

Original article: https://www.cnblogs.com/coderxiaocai/p/5057822.html