PredictionIO + Universal Recommender: Notes on Problems Hit While Quickly Building and Deploying a Recommendation Engine (2)

1. pio build succeeds for Universal Recommender, but reports No engine found

Building and delpoying model
[INFO] [Engine$] Using command '/home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt' at /home/vagrant/workspace/universal-recommender to build.
[INFO] [Engine$] If the path above is incorrect, this process will fail.
[INFO] [Engine$] Uber JAR disabled. Making sure lib/pio-assembly-0.11.1-SNAPSHOT.jar is absent.
[INFO] [Engine$] Going to run: /home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt  package assemblyPackageDependency in /home/vagrant/workspace/universal-recommender
[INFO] [Engine$] Compilation finished successfully.
[INFO] [Engine$] Looking for an engine...
[ERROR] [Engine$] No engine found. Your build might have failed. Aborting.


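This is caused by a Scala version mismatch. Look inside target, the build output directory of universal-recommender, and you will find a directory named scala-2.10.

Our PredictionIO, however, was built with Scala 2.11 specified at make time, so it looks for the engine jar under scala-2.11 and, naturally, reports No engine found.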
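To see the mismatch at a glance, a small check like the one below can help. This is only a sketch: the function name check_scala_target is made up for this example, and it assumes the engine jars sit directly under target/scala-<version>, as in the build described above.

```shell
# Hypothetical helper: list the scala-* directories under an engine's target
# directory and report whether the one PredictionIO expects contains a jar.
# $1 = target directory, $2 = expected Scala binary version (e.g. 2.11)
check_scala_target() {
  local target_dir="$1" expected="scala-$2" d
  for d in "$target_dir"/scala-*; do
    if [ -d "$d" ]; then
      echo "found: $(basename "$d")"
    fi
  done
  if ls "$target_dir/$expected"/*.jar >/dev/null 2>&1; then
    echo "ok: jars present in $expected"
  else
    echo "no jars in $expected"
    return 1
  fi
}
```

For the setup in this post it would be called as check_scala_target /home/vagrant/workspace/universal-recommender/target 2.11.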

A temporary workaround is to rename or copy scala-2.10 to scala-2.11, after which PredictionIO runs normally.

[vagrant@master universal-recommender]$ cd /home/vagrant/workspace/universal-recommender/target
[vagrant@master target]$ ls
resolution-cache  scala-2.10  streams
[vagrant@master target]$ cp -r scala-2.10 scala-2.11
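If the workaround has to be repeated after every build, it can be wrapped so that it is safe to run more than once. The following is a minimal sketch; copy_scala_output is a hypothetical name, and the function does nothing beyond the cp -r shown above plus two sanity checks.

```shell
# Hypothetical helper: copy the directory sbt produced to the directory
# PredictionIO looks in, unless the latter already exists.
# $1 = target directory, $2 = built dir (e.g. scala-2.10),
# $3 = expected dir (e.g. scala-2.11)
copy_scala_output() {
  local target_dir="$1" built="$2" expected="$3"
  if [ ! -d "$target_dir/$built" ]; then
    echo "missing: $target_dir/$built" >&2
    return 1
  fi
  if [ -d "$target_dir/$expected" ]; then
    echo "already present: $target_dir/$expected"
    return 0
  fi
  cp -r "$target_dir/$built" "$target_dir/$expected"
  echo "copied $built -> $expected"
}
```

Note that an existing scala-2.11 directory is left untouched, so a stale copy from an earlier build has to be removed by hand before re-running it.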

2. Fixing the Scala version of Universal Recommender

The approach above is only a stopgap; the Scala versions of PredictionIO and Universal Recommender still need to be aligned.

We can set the Scala version of Universal Recommender by editing build.sbt:

name := "universal-recommender"

scalaVersion := "2.11.8"

However, this ends in a build error. The reason is that some of Universal Recommender's dependencies, such as the mahout packages, have no Scala 2.11 build.

[vagrant@master universal-recommender]$ pio build
[INFO] [Engine$] Using command '/home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt' at /home/vagrant/workspace/universal-recommender to build.
[INFO] [Engine$] If the path above is incorrect, this process will fail.
[INFO] [Engine$] Uber JAR disabled. Making sure lib/pio-assembly-0.11.1-SNAPSHOT.jar is absent.
[INFO] [Engine$] Going to run: /home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt  package assemblyPackageDependency in /home/vagrant/workspace/universal-recommender
[ERROR] [Engine$] [error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.mahout#mahout-math-scala_2.11;0.13.0: not found
[ERROR] [Engine$] [error] unresolved dependency: org.apache.mahout#mahout-spark_2.11;0.13.0: not found
[ERROR] [Engine$] [error] Total time: 69 s, completed Sep 8, 2017 10:06:41 AM
[ERROR] [Engine$] Return code of build command: /home/vagrant/pio_elastic1/PredictionIO-0.11.1-SNAPSHOT/sbt/sbt  package assemblyPackageDependency is 1. Aborting.
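Before touching build.sbt, it can be worth confirming which Scala builds of a dependency are actually published. The sketch below assumes the artifact would be resolved from Maven Central (repo1.maven.org) and relies on its standard group/artifact/version directory layout; the function names are made up for this example, and the build may consult other resolvers as well. The second function needs curl and network access.

```shell
# Hypothetical helper: build the Maven Central URL of an artifact's POM.
# $1 = group id, $2 = artifact id (including any _2.xx suffix), $3 = version
maven_pom_url() {
  local group_path
  group_path="$(printf '%s' "$1" | tr '.' '/')"
  printf 'https://repo1.maven.org/maven2/%s/%s/%s/%s-%s.pom\n' \
    "$group_path" "$2" "$3" "$2" "$3"
}

# Succeeds if a HEAD request for the POM returns a success status.
artifact_published() {
  curl --silent --fail --head --output /dev/null \
    "$(maven_pom_url "$1" "$2" "$3")"
}
```

For example, artifact_published org.apache.mahout mahout-spark_2.11 0.13.0 can be compared against the same call with mahout-spark_2.10.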

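In the end I had to perform major surgery on build.sbt. The basic principles were:

 1) Upgrade every dependency that has a Scala 2.11 build;

 2) For packages with no 2.11 build, such as mahout, pin the artifact explicitly to its 2.10 build;

 3) Where the dependency tree pulls in both the 2.10 and the 2.11 build of a package, exclude the 2.10 one.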
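Rule 3 means first finding out which packages show up in both flavors. A rough way to do that, assuming you already have a plain list of resolved artifact names (one per line, such as json4s-core_2.10), is sketched below; find_scala_conflicts is a hypothetical name, and how the list is collected is left open.

```shell
# Hypothetical helper: read artifact names on stdin and print the base names
# that occur with both the _2.10 and the _2.11 suffix.
find_scala_conflicts() {
  awk '
    /_2\.10$/ { base = $0; sub(/_2\.10$/, "", base); s210[base] = 1 }
    /_2\.11$/ { base = $0; sub(/_2\.11$/, "", base); s211[base] = 1 }
    END { for (b in s210) if (b in s211) print b }
  ' | sort
}
```

Each name it prints is a candidate for an exclude(...) on the 2.10 artifact, like the ones in the build file that follows.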

The final result looks like this:

import scalariform.formatter.preferences._
import com.typesafe.sbt.SbtScalariform
import com.typesafe.sbt.SbtScalariform.ScalariformKeys
import sbt.Keys.scalaVersion

name := "universal-recommender"

version := "0.6.1-SNAPSHOT"

organization := "com.actionml"

scalaVersion := "2.11.8"

val mahoutVersion = "0.13.0"

val pioVersion = "0.11.0-incubating"

val elasticsearch1Version = "1.7.6"

//val elasticsearch5Version = "5.1.2"

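libraryDependencies ++= Seq(
  "org.apache.predictionio" %% "apache-predictionio-core" % pioVersion % "provided",
  "org.apache.predictionio" %% "apache-predictionio-data-elasticsearch1" % pioVersion % "provided",
  // Spark is "provided": on the compile classpath only, supplied by the Spark installation at runtime.
  // Note that spark-mllib is still at 1.4.0 here while spark-core is 2.1.0.
  "org.apache.spark" % "spark-core_2.11" % "2.1.0" % "provided",
  "org.apache.spark" % "spark-mllib_2.11" % "1.4.0" % "provided",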
  "org.xerial.snappy" % "snappy-java" % "1.1.1.7",
  // Mahout's Spark libs
  "org.apache.mahout" % "mahout-math-scala_2.10" % mahoutVersion
    exclude("com.github.scopt", "scopt_2.10")
    exclude("org.spire-math", "spire_2.10")
    exclude("org.scalanlp", "breeze_2.10")
    exclude("org.spire-math", "spire-macros_2.10")
    exclude("org.apache.spark", "spark-mllib_2.10")
    exclude("org.json4s", "json4s-ast_2.10")
    exclude("org.json4s", "json4s-core_2.10")
    exclude("org.json4s", "json4s-native_2.10")
    exclude("org.scalanlp", "breeze-macros_2.10")
    exclude("com.esotericsoftware.kryo", "kryo")
    exclude("com.twitter", "chill_2.10"),
  "org.apache.mahout" % "mahout-spark_2.10" % mahoutVersion
    exclude("com.github.scopt", "scopt_2.10")
    exclude("org.spire-math", "spire_2.10")
    exclude("org.scalanlp", "breeze_2.10")
    exclude("org.spire-math", "spire-macros_2.10")
    exclude("org.apache.spark", "spark-mllib_2.10")
    exclude("org.json4s", "json4s-ast_2.10")
    exclude("org.json4s", "json4s-core_2.10")
    exclude("org.json4s", "json4s-native_2.10")
    exclude("com.twitter", "chill_2.10")
    exclude("org.scalanlp", "breeze-macros_2.10")
    exclude("com.esotericsoftware.kryo", "kryo")
    exclude("org.apache.spark", "spark-launcher_2.10")
    exclude("org.apache.spark", "spark-unsafe_2.10")
    exclude("org.apache.spark", "spark-tags_2.10")
    exclude("org.apache.spark", "spark-core_2.10")
    exclude("org.apache.spark", "spark-network-common_2.10")
    exclude("org.apache.spark", "spark-streaming_2.10")
    exclude("org.apache.spark", "spark-graphx_2.10")
    exclude("org.apache.spark", "spark-catalyst_2.10")
    exclude("org.apache.spark", "spark-sql_2.10"),
  "org.apache.mahout"  % "mahout-math" % mahoutVersion,
  "org.apache.mahout"  % "mahout-hdfs" % mahoutVersion
    exclude("com.thoughtworks.xstream", "xstream")
    exclude("org.apache.hadoop", "hadoop-client"),
  //"org.apache.hbase"        % "hbase-client"   % "0.98.5-hadoop2" % "provided",
  //  exclude("org.apache.zookeeper", "zookeeper"),
  // other external libs
  "com.thoughtworks.xstream" % "xstream" % "1.4.4"
    exclude("xmlpull", "xmlpull"),
  // possible build for es5 
  //"org.elasticsearch"       %% "elasticsearch-spark-13" % elasticsearch5Version % "provided",
  "org.elasticsearch" % "elasticsearch" % "1.7.5" % "provided",
  "org.elasticsearch" % "elasticsearch-spark-20_2.11" % "5.5.1",
//    exclude("org.apache.spark", "spark-launcher_2.11")
//    exclude("org.apache.spark", "spark-unsafe_2.11")
//    exclude("org.apache.spark", "spark-tags_2.11")
//    exclude("org.apache.spark", "spark-core_2.11")
//    exclude("org.apache.spark", "spark-network-common_2.11")
//    exclude("org.apache.spark", "spark-streaming_2.11")
//    exclude("org.apache.spark", "spark-catalyst_2.11")
//    exclude("org.apache.spark", "spark-sql_2.11"),
  "org.json4s" % "json4s-native_2.11" % "3.2.10")
  .map(_.exclude("org.apache.lucene","lucene-core")).map(_.exclude("org.apache.lucene","lucene-analyzers-common"))

resolvers += Resolver.mavenLocal

SbtScalariform.scalariformSettings

ScalariformKeys.preferences := ScalariformKeys.preferences.value
  .setPreference(AlignSingleLineCaseStatements, true)
  .setPreference(DoubleIndentClassDeclaration, true)
  .setPreference(DanglingCloseParenthesis, Prevent)
  .setPreference(MultilineScaladocCommentsStartOnFirstLine, true)

assemblyMergeStrategy in assembly := {
  case "plugin.properties" => MergeStrategy.discard
  case PathList(ps @ _*) if ps.last endsWith "package-info.class" =>
    MergeStrategy.first
  case PathList(ps @ _*) if ps.last endsWith "UnusedStubClass.class" =>
    MergeStrategy.first
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

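Open-source projects such as PredictionIO and Universal Recommender do suffer from official documentation that is incomplete or slow to be updated. The chance of succeeding on the first try by following the official guide is low. It takes repeated experiments and investigation, and a search for clues across the official site, the mailing lists, and other corners of the internet, before a problem is finally solved.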

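That said, the PredictionIO community is quite active. The developer of Universal Recommender is also an important PredictionIO developer who is willing to keep the product going and actually does so, and technical support on the mailing lists is fairly good.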

Original article: https://www.cnblogs.com/csliwei/p/8093083.html