Installing and Configuring hadoop-lzo

Step 1: Download the packages
  wget http://www.apache.org/dist/ant/binaries/apache-ant-1.8.0RC1-bin.tar.gz
  wget https://github.com/toddlipcon/hadoop-lzo/archive/0.4.15.tar.gz
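
  Step 2 also extracts lzo-2.06.tar.gz, which is not downloaded above; assuming the upstream LZO download site layout is unchanged, it can be fetched with:
  wget http://www.oberhumer.com/opensource/lzo/download/lzo-2.06.tar.gz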

Step 2: Install ant and lzo
  # Install ant
  tar -xvf apache-ant-1.8.0RC1-bin.tar.gz
  mkdir /usr/local/ant/
  cp -R apache-ant-1.8.0RC1-bin/* /usr/local/ant/
  export ANT_HOME=/usr/local/ant
  export ANT=$ANT_HOME/bin
  export PATH=$ANT:$PATH
  ant -version
  # If "ant -version" prints the version information, ant is installed correctly
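
  The exports above only apply to the current shell session; assuming a bash login shell, one way to make them permanent is to append them to ~/.bashrc:
  echo 'export ANT_HOME=/usr/local/ant' >> ~/.bashrc
  echo 'export PATH=$ANT_HOME/bin:$PATH' >> ~/.bashrc
  source ~/.bashrc
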
  # Install lzo
  tar -xvf lzo-2.06.tar.gz
  cd lzo-2.06
  ./configure --enable-shared --prefix=/usr/local/lzo-2.06
  make
  make install
  cp /usr/local/lzo-2.06/lib/* /usr/lib
  cp /usr/local/lzo-2.06/lib/* /usr/lib64
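
  After copying the shared libraries into /usr/lib and /usr/lib64, it is worth refreshing the dynamic linker cache so the new liblzo2 libraries are picked up (a quick listing also confirms the install):
  ldconfig
  ls /usr/local/lzo-2.06/lib/   # should show liblzo2.a and liblzo2.so*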

Step 3: Build hadoop-lzo 0.4.15
  tar -xvf 0.4.15.tar.gz
  cd hadoop-lzo-0.4.15
  export C_INCLUDE_PATH=/usr/local/lzo-2.06/include
  export LIBRARY_PATH=/usr/local/lzo-2.06/lib
  ant compile-native tar   # This also builds hadoop-lzo into a standalone directory (build/hadoop-lzo-0.4.15) and packages a tarball (build/hadoop-lzo-0.4.15.tar.gz)
  cp -R build/hadoop-lzo-0.4.15 /home/inoknok_hdp/   # [Optional] For convenience, copy the generated hadoop-lzo-0.4.15 to a separate directory.
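
  Before moving on to step 4, it is worth checking that the native library was actually built; assuming a 64-bit Linux build host, the ant target should have produced libgplcompression.* under the native directory:
  ls build/hadoop-lzo-0.4.15/lib/native/Linux-amd64-64/   # expect libgplcompression.a / libgplcompression.so*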

Step 4: Configure hadoop-lzo into Hadoop and HBase
  cd build/hadoop-lzo-0.4.15
  cp hadoop-lzo-0.4.15.jar /home/inoknok_hdp/hadoop-1.0.4/lib/
  cp hadoop-lzo-0.4.15.jar /home/inoknok_hdp/hbase-0.94.13/lib/
  cp -R lib/native/Linux-amd64-64/ /home/inoknok_hdp/hbase-0.94.13/lib/native/
  cp -R lib/native/Linux-amd64-64/ /home/inoknok_hdp/hadoop-1.0.4/lib/native/

  vi core-site.xml   # under hadoop-1.0.4/conf/
   <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
   </property>

  <property>
    <name>io.compression.codec.lzo.class</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>

  vi mapred-site.xml   # under hadoop-1.0.4/conf/
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>

  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>

  <property>
    <name>mapred.child.env</name>
    <value>JAVA_LIBRARY_PATH=/home/inoknok_hdp/hadoop-1.0.4/lib/native/Linux-amd64-64/</value>
  </property>
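
  Note that the mapreduce.* names above are the newer (Hadoop 2.x style) property names; on hadoop-1.0.4 the MRv1 equivalents are, to the best of my knowledge, mapred.compress.map.output and mapred.map.output.compression.codec, so it may be safer to set those as well:
  <property>
    <name>mapred.compress.map.output</name>
    <value>true</value>
  </property>

  <property>
    <name>mapred.map.output.compression.codec</name>
    <value>com.hadoop.compression.lzo.LzoCodec</value>
  </property>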

Step 5: Restart Hadoop and HBase; LZO is now supported
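
  Before creating a real table, the codec wiring can be sanity-checked with HBase's CompressionTest utility (the local file path below is just an arbitrary example):
  hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/lzotest.txt lzo
  # it should report success if the LZO codec and its native library load correctly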

Step 6: Test from the hbase shell
  create 'lzotest', {NAME=>'cf', COMPRESSION=>'lzo'}
  put 'lzotest', 'row-1', 'cf:col-1', 'val-1'
  put 'lzotest', 'row-2', 'cf:col-2', 'val-2'
  put 'lzotest', 'row-3', 'cf', 'val-3'
  put 'lzotest', 'row-4', 'cf:col-1', 'val-4'

  scan 'lzotest'
  ROW COLUMN+CELL
  row-1 column=cf:col-1, timestamp=1342424266301, value=val-1
  row-2 column=cf:col-2, timestamp=1342424275314, value=val-2
  row-3 column=cf:, timestamp=1342424286206, value=val-3
  row-4 column=cf:col-1, timestamp=1342424315516, value=val-4
  4 row(s) in 0.0750 seconds
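
  To confirm the table really uses LZO, describe it; the cf column family should show COMPRESSION => 'LZO':
  describe 'lzotest'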

References:
  http://blog.csdn.net/inkfish/article/details/5194022
  https://code.google.com/a/apache-extras.org/p/hadoop-gpl-compression/wiki/FAQ
  http://my.oschina.net/orion/blog/144146
  http://ant.apache.org/manual/install.html
  http://share.blog.51cto.com/278008/549393
  https://github.com/toddlipcon/hadoop-lzo

Problems encountered:

Problem 1: The value of ivy_repo_url (in hadoop-lzo's build.xml) may need to be changed
  <property name="ivy_repo_url" value="http://repo2.maven.org/maven2/org/apache/ivy/ivy/${ivy.version}/ivy-${ivy.version}.jar"/>
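
  If repo2.maven.org is unreachable, pointing the property at the main Maven Central host (or another mirror) should work, for example:
  <property name="ivy_repo_url" value="https://repo1.maven.org/maven2/org/apache/ivy/ivy/${ivy.version}/ivy-${ivy.version}.jar"/>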

Problem 2: [javah] Error: Class org.apache.hadoop.conf.Configuration could not be found.
  vi build.xml
  # Find the javah task and add <classpath refid="classpath"/> inside it, as shown:
    <javah classpath="${build.classes}"
           destdir="${build.native}/src/com/hadoop/compression/lzo"
           force="yes"
           verbose="yes">
      <class name="com.hadoop.compression.lzo.LzoCompressor" />
      <class name="com.hadoop.compression.lzo.LzoDecompressor" />
      <classpath refid="classpath"/>
    </javah>

Problem 3: configure: error: lzo headers were not found...
  export C_INCLUDE_PATH=/usr/local/lzo-2.06/include
  export LIBRARY_PATH=/usr/local/lzo-2.06/lib

Note: Using the prebuilt https://hadoop-gpl-compression.apache-extras.org.codespot.com/files/hadoop-gpl-compression-0.1.0-rc0.tar.gz directly runs into conflicts partway through, so building from source as above is recommended.

Original post: https://www.cnblogs.com/MarkGrid/p/3435916.html