spark编译

安装jdk1.8

vi pom.xml

CDH版本编译添加:

pom.xml
<repositories>
  <repository>
    <id>central</id>
    <!-- This should be at top, it makes maven try the central repo first and then others and hence faster dep resolution -->
    <name>Maven Repository</name>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>cloudera</id>
  </repository>
</repositories>
<pluginRepositories>
  <pluginRepository>
    <id>central</id>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </pluginRepository>
</pluginRepositories>
<dependencies>

./dev/make-distribution.sh (自定义版本)

VERSION=2.3.1

#VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)

#SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null\

 #   | grep -v "INFO"\

  #  | tail -n 1)

SCALA_VERSION=2.11

SPARK_HADOOP_VERSION=2.6.0

#SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\

#    | grep -v "INFO"\

#    | tail -n 1)

SPARK_HIVE=1

#SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null\

 #   | grep -v "INFO"\

#    | fgrep --count "<id>hive</id>";\

    # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing\

    # because we use "set -o pipefail"

 #   echo -n)

执行编译命令:

./dev/make-distribution.sh --name 2.6.0-cdh5.10.0 --tgz  -Pyarn -Phadoop-2.6  -Dhadoop.version=2.6.0-cdh5.10.0 -Phive -Phive-thriftserver -DskipTests

原文地址:https://www.cnblogs.com/moonypog/p/10251018.html