spark-user mailing list archives

From Anahita Talebi <anahita.t.am...@gmail.com>
Subject Re: Upgrade the scala code using the most updated Spark version
Date Tue, 28 Mar 2017 20:10:13 GMT
Hi,

Thanks for your answer.

I first changed the Scala version to 2.11.8 and kept the Spark version at
1.5.2 (the old version). Then I changed the scalatest version to "3.0.1".
With this configuration I could compile the code and generate the .jar file.

When I changed the Spark version to 2.1.0, I got the same error as before,
so I imagine the problem is somehow related to the Spark version.
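
The compile error quoted further down the thread points at
mapPartitionsWithSplit, which was removed in Spark 2.0 in favour of
mapPartitionsWithIndex. Below is a minimal, self-contained sketch of the
likely change; the object name, app name and input path are only
illustrative, and in the project itself the change would go into
OptUtils.scala around line 40:

import org.apache.spark.{SparkConf, SparkContext}

object PartitionSizes {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("partition-sizes").setMaster("local[*]"))
    val data = sc.textFile(args(0))  // an RDD[String], as in OptUtils

    // mapPartitionsWithIndex takes the same (partition index, iterator)
    // arguments as the removed mapPartitionsWithSplit, and keeps `lines`
    // typed as Iterator[String], so lines.length compiles again.
    val sizes = data.mapPartitionsWithIndex { case (i, lines) =>
      Iterator(i -> lines.length)
    }.collect()

    sizes.foreach { case (i, n) => println(s"partition $i: $n lines") }
    sc.stop()
  }
}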

Cheers,
Anahita

--------------------------------------------------------------------------------------------------------------------------------------------------------
import AssemblyKeys._

assemblySettings

name := "proxcocoa"

version := "0.1"

organization := "edu.berkeley.cs.amplab"

scalaVersion := "2.11.8"

parallelExecution in Test := false

{
  val excludeHadoop = ExclusionRule(organization = "org.apache.hadoop")
  libraryDependencies ++= Seq(
    "org.slf4j" % "slf4j-api" % "1.7.2",
    "org.slf4j" % "slf4j-log4j12" % "1.7.2",
    "org.scalatest" %% "scalatest" % "3.0.1" % "test",
    "org.apache.spark" %% "spark-core" % "2.1.0" excludeAll(excludeHadoop),
    "org.apache.spark" %% "spark-mllib" % "2.1.0" excludeAll(excludeHadoop),
    "org.apache.spark" %% "spark-sql" % "2.1.0" excludeAll(excludeHadoop),
    "org.apache.commons" % "commons-compress" % "1.7",
    "commons-io" % "commons-io" % "2.4",
    "org.scalanlp" % "breeze_2.11" % "0.11.2",
    "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
    "com.github.scopt" %% "scopt" % "3.3.0"
  )
}

{
  val defaultHadoopVersion = "1.0.4"
  val hadoopVersion =
    scala.util.Properties.envOrElse("SPARK_HADOOP_VERSION", defaultHadoopVersion)
  libraryDependencies += "org.apache.hadoop" % "hadoop-client" % hadoopVersion
}

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.1.0"

resolvers ++= Seq(
  "Local Maven Repository" at Path.userHome.asFile.toURI.toURL +
".m2/repository",
  "Typesafe" at "http://repo.typesafe.com/typesafe/releases",
  "Spray" at "http://repo.spray.cc"
)

mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    case PathList("javax", "servlet", xs @ _*)           => MergeStrategy.first
    case PathList(ps @ _*) if ps.last endsWith ".html"   => MergeStrategy.first
    case "application.conf"                              => MergeStrategy.concat
    case "reference.conf"                                => MergeStrategy.concat
    case "log4j.properties"                              => MergeStrategy.discard
    case m if m.toLowerCase.endsWith("manifest.mf")      => MergeStrategy.discard
    case m if m.toLowerCase.matches("meta-inf.*\\.sf$")  => MergeStrategy.discard
    case _ => MergeStrategy.first
  }
}

test in assembly := {}
--------------------------------------------------------------------------------------------------------------------------------------------------------

On Tue, Mar 28, 2017 at 9:33 PM, Marco Mistroni <mmistroni@gmail.com> wrote:

> Hello
> that looks to me like there's something dodgy with your Scala installation.
> Though Spark 2.0 is built on Scala 2.11, it still supports 2.10... I
> suggest you change one thing at a time in your sbt:
> first the Spark version. Run it and see if it works.
> Then amend the Scala version.
>
> hth
>  marco
>
> On Tue, Mar 28, 2017 at 5:20 PM, Anahita Talebi <anahita.t.amiri@gmail.com> wrote:
>
>> Hello,
>>
>> Thank you all for your informative answers.
>> I actually changed the Scala version to 2.11.8 and the Spark version to
>> 2.1.0 in the build.sbt.
>>
>> Except for these two (the Scala and Spark versions), I kept the same
>> values for the rest of the build.sbt file.
>> ------------------------------------------------------------
>> ---------------
>> import AssemblyKeys._
>>
>> assemblySettings
>>
>> name := "proxcocoa"
>>
>> version := "0.1"
>>
>> scalaVersion := "2.11.8"
>>
>> parallelExecution in Test := false
>>
>> {
>>   val excludeHadoop = ExclusionRule(organization = "org.apache.hadoop")
>>   libraryDependencies ++= Seq(
>>     "org.slf4j" % "slf4j-api" % "1.7.2",
>>     "org.slf4j" % "slf4j-log4j12" % "1.7.2",
>>     "org.scalatest" %% "scalatest" % "1.9.1" % "test",
>>     "org.apache.spark" % "spark-core_2.11" % "2.1.0"
>> excludeAll(excludeHadoop),
>>     "org.apache.spark" % "spark-mllib_2.11" % "2.1.0"
>> excludeAll(excludeHadoop),
>>     "org.apache.spark" % "spark-sql_2.11" % "2.1.0"
>> excludeAll(excludeHadoop),
>>     "org.apache.commons" % "commons-compress" % "1.7",
>>     "commons-io" % "commons-io" % "2.4",
>>     "org.scalanlp" % "breeze_2.11" % "0.11.2",
>>     "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
>>     "com.github.scopt" %% "scopt" % "3.3.0"
>>   )
>> }
>>
>> {
>>   val defaultHadoopVersion = "1.0.4"
>>   val hadoopVersion =
>>     scala.util.Properties.envOrElse("SPARK_HADOOP_VERSION",
>> defaultHadoopVersion)
>>   libraryDependencies += "org.apache.hadoop" % "hadoop-client" %
>> hadoopVersion
>> }
>>
>> libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" %
>> "2.1.0"
>>
>> resolvers ++= Seq(
>>   "Local Maven Repository" at Path.userHome.asFile.toURI.toURL +
>> ".m2/repository",
>>   "Typesafe" at "http://repo.typesafe.com/typesafe/releases",
>>   "Spray" at "http://repo.spray.cc"
>> )
>>
>> mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
>>   {
>>     case PathList("javax", "servlet", xs @ _*)           =>
>> MergeStrategy.first
>>     case PathList(ps @ _*) if ps.last endsWith ".html"   =>
>> MergeStrategy.first
>>     case "application.conf"                              =>
>> MergeStrategy.concat
>>     case "reference.conf"                                =>
>> MergeStrategy.concat
>>     case "log4j.properties"                              =>
>> MergeStrategy.discard
>>     case m if m.toLowerCase.endsWith("manifest.mf")      =>
>> MergeStrategy.discard
>>     case m if m.toLowerCase.matches("meta-inf.*\\.sf$")  =>
>> MergeStrategy.discard
>>     case _ => MergeStrategy.first
>>   }
>> }
>>
>> test in assembly := {}
>> ----------------------------------------------------------------
>>
>> When I compile the code, I get the following error:
>>
>> [info] Compiling 4 Scala sources to /Users/atalebi/Desktop/new_version_proxcocoa-master/target/scala-2.11/classes...
>> [error] /Users/atalebi/Desktop/new_version_proxcocoa-master/src/main/scala/utils/OptUtils.scala:40: value mapPartitionsWithSplit is not a member of org.apache.spark.rdd.RDD[String]
>> [error]     val sizes = data.mapPartitionsWithSplit{ case(i,lines) =>
>> [error]                      ^
>> [error] /Users/atalebi/Desktop/new_version_proxcocoa-master/src/main/scala/utils/OptUtils.scala:41: value length is not a member of Any
>> [error]       Iterator(i -> lines.length)
>> [error]                           ^
>> ----------------------------------------------------------------
>> The error is in the code itself. Does it mean that for different
>> versions of Spark and Scala, I need to change the main code?
>>
>> Thanks,
>> Anahita
>>
>>
>>
>>
>>
>>
>> On Tue, Mar 28, 2017 at 10:28 AM, Dinko Srkoč <dinko.srkoc@gmail.com>
>> wrote:
>>
>>> Adding to the advice given by others ... Spark 2.1.0 works with Scala 2.11,
>>> so set:
>>>
>>>   scalaVersion := "2.11.8"
>>>
>>> When you see something like:
>>>
>>>   "org.apache.spark" % "spark-core_2.10" % "1.5.2"
>>>
>>> that means that the `spark-core` library is compiled against Scala 2.10,
>>> so you would have to change that to 2.11:
>>>
>>>   "org.apache.spark" % "spark-core_2.11" % "2.1.0"
>>>
>>> Better yet, let SBT worry about libraries built against particular
>>> Scala versions:
>>>
>>>   "org.apache.spark" %% "spark-core" % "2.1.0"
>>>
>>> The `%%` will instruct SBT to choose the library appropriate for a
>>> version of Scala that is set in `scalaVersion`.
>>>
>>> It may be worth mentioning that the `%%` thing works only with Scala
>>> libraries as they are compiled against a certain Scala version. Java
>>> libraries are unaffected (have nothing to do with Scala), e.g. for
>>> `slf4j` one only uses single `%`s:
>>>
>>>   "org.slf4j" % "slf4j-api" % "1.7.2"
>>>
>>> Cheers,
>>> Dinko
>>>
>>> On 27 March 2017 at 23:30, Mich Talebzadeh <mich.talebzadeh@gmail.com>
>>> wrote:
>>> > check these versions
>>> >
>>> > function create_build_sbt_file {
>>> >         BUILD_SBT_FILE=${GEN_APPSDIR}/scala/${APPLICATION}/build.sbt
>>> >         [ -f ${BUILD_SBT_FILE} ] && rm -f ${BUILD_SBT_FILE}
>>> >         cat >> $BUILD_SBT_FILE << !
>>> > lazy val root = (project in file(".")).
>>> >   settings(
>>> >     name := "${APPLICATION}",
>>> >     version := "1.0",
>>> >     scalaVersion := "2.11.8",
>>> >     mainClass in Compile := Some("myPackage.${APPLICATION}")
>>> >   )
>>> > libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" %
>>> > "provided"
>>> > libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" %
>>> > "provided"
>>> > libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.0.0" %
>>> > "provided"
>>> > libraryDependencies += "org.apache.spark" %% "spark-streaming" %
>>> "2.0.0" %
>>> > "provided"
>>> > libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka" %
>>> > "1.6.1" % "provided"
>>> > libraryDependencies += "com.google.code.gson" % "gson" % "2.6.2"
>>> > libraryDependencies += "org.apache.phoenix" % "phoenix-spark" %
>>> > "4.6.0-HBase-1.0"
>>> > libraryDependencies += "org.apache.hbase" % "hbase" % "1.2.3"
>>> > libraryDependencies += "org.apache.hbase" % "hbase-client" % "1.2.3"
>>> > libraryDependencies += "org.apache.hbase" % "hbase-common" % "1.2.3"
>>> > libraryDependencies += "org.apache.hbase" % "hbase-server" % "1.2.3"
>>> > // META-INF discarding
>>> > mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
>>> >    {
>>> >     case PathList("META-INF", xs @ _*) => MergeStrategy.discard
>>> >     case x => MergeStrategy.first
>>> >    }
>>> > }
>>> > !
>>> > }
>>> >
>>> > HTH
>>> >
>>> > Dr Mich Talebzadeh
>>> >
>>> >
>>> >
>>> > LinkedIn
>>> > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>> >
>>> >
>>> >
>>> > http://talebzadehmich.wordpress.com
>>> >
>>> >
>>> > Disclaimer: Use it at your own risk. Any and all responsibility for any
>>> > loss, damage or destruction of data or any other property which may arise
>>> > from relying on this email's technical content is explicitly disclaimed. The
>>> > author will in no case be liable for any monetary damages arising from such
>>> > loss, damage or destruction.
>>> >
>>> >
>>> >
>>> >
>>> > On 27 March 2017 at 21:45, Jörn Franke <jornfranke@gmail.com> wrote:
>>> >>
>>> >> Usually you define the dependencies on the Spark libraries as provided.
>>> >> You also seem to mix different Spark versions, which should be avoided.
>>> >> The Hadoop library seems to be outdated and should also only be provided.
>>> >>
>>> >> The other dependencies you could assemble in a fat jar.
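>>> >>
>>> >> For example, something along these lines (only a sketch, assuming the
>>> >> sbt-assembly plugin already used in the build.sbt quoted below; the
>>> >> Hadoop version here is a placeholder and should match the target cluster):
>>> >>
>>> >>   // Spark and Hadoop come from the cluster at runtime, so they are
>>> >>   // marked "provided" and left out of the fat jar.
>>> >>   libraryDependencies ++= Seq(
>>> >>     "org.apache.spark" %% "spark-core"  % "2.1.0" % "provided",
>>> >>     "org.apache.spark" %% "spark-mllib" % "2.1.0" % "provided",
>>> >>     "org.apache.spark" %% "spark-sql"   % "2.1.0" % "provided",
>>> >>     "org.apache.hadoop" % "hadoop-client" % "2.7.3" % "provided"
>>> >>   )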
>>> >>
>>> >> On 27 Mar 2017, at 21:25, Anahita Talebi <anahita.t.amiri@gmail.com>
>>> >> wrote:
>>> >>
>>> >> Hi friends,
>>> >>
>>> >> I have code written in Scala. Scala version 2.10.4 and
>>> >> Spark version 1.5.2 are used to run it.
>>> >>
>>> >> I would like to upgrade the code to the most recent version of Spark,
>>> >> meaning 2.1.0.
>>> >>
>>> >> Here is the build.sbt:
>>> >>
>>> >> import AssemblyKeys._
>>> >>
>>> >> assemblySettings
>>> >>
>>> >> name := "proxcocoa"
>>> >>
>>> >> version := "0.1"
>>> >>
>>> >> scalaVersion := "2.10.4"
>>> >>
>>> >> parallelExecution in Test := false
>>> >>
>>> >> {
>>> >>   val excludeHadoop = ExclusionRule(organization = "org.apache.hadoop")
>>> >>   libraryDependencies ++= Seq(
>>> >>     "org.slf4j" % "slf4j-api" % "1.7.2",
>>> >>     "org.slf4j" % "slf4j-log4j12" % "1.7.2",
>>> >>     "org.scalatest" %% "scalatest" % "1.9.1" % "test",
>>> >>     "org.apache.spark" % "spark-core_2.10" % "1.5.2"
>>> >> excludeAll(excludeHadoop),
>>> >>     "org.apache.spark" % "spark-mllib_2.10" % "1.5.2"
>>> >> excludeAll(excludeHadoop),
>>> >>     "org.apache.spark" % "spark-sql_2.10" % "1.5.2"
>>> >> excludeAll(excludeHadoop),
>>> >>     "org.apache.commons" % "commons-compress" % "1.7",
>>> >>     "commons-io" % "commons-io" % "2.4",
>>> >>     "org.scalanlp" % "breeze_2.10" % "0.11.2",
>>> >>     "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
>>> >>     "com.github.scopt" %% "scopt" % "3.3.0"
>>> >>   )
>>> >> }
>>> >>
>>> >> {
>>> >>   val defaultHadoopVersion = "1.0.4"
>>> >>   val hadoopVersion =
>>> >>     scala.util.Properties.envOrElse("SPARK_HADOOP_VERSION",
>>> >> defaultHadoopVersion)
>>> >>   libraryDependencies += "org.apache.hadoop" % "hadoop-client" %
>>> >> hadoopVersion
>>> >> }
>>> >>
>>> >> libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" %
>>> >> "1.5.0"
>>> >>
>>> >> resolvers ++= Seq(
>>> >>   "Local Maven Repository" at Path.userHome.asFile.toURI.toURL +
>>> >> ".m2/repository",
>>> >>   "Typesafe" at "http://repo.typesafe.com/typesafe/releases",
>>> >>   "Spray" at "http://repo.spray.cc"
>>> >> )
>>> >>
>>> >> mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
>>> >>   {
>>> >>     case PathList("javax", "servlet", xs @ _*)           =>
>>> >> MergeStrategy.first
>>> >>     case PathList(ps @ _*) if ps.last endsWith ".html"   =>
>>> >> MergeStrategy.first
>>> >>     case "application.conf"                              =>
>>> >> MergeStrategy.concat
>>> >>     case "reference.conf"                                =>
>>> >> MergeStrategy.concat
>>> >>     case "log4j.properties"                              =>
>>> >> MergeStrategy.discard
>>> >>     case m if m.toLowerCase.endsWith("manifest.mf")      =>
>>> >> MergeStrategy.discard
>>> >>     case m if m.toLowerCase.matches("meta-inf.*\\.sf$")  =>
>>> >> MergeStrategy.discard
>>> >>     case _ => MergeStrategy.first
>>> >>   }
>>> >> }
>>> >>
>>> >> test in assembly := {}
>>> >>
>>> >> -----------------------------------------------------------
>>> >> I downloaded Spark 2.1.0 and changed the Spark version and scalaVersion
>>> >> in the build.sbt. But unfortunately, I failed to run the code.
>>> >>
>>> >> Does anybody know how I can upgrade the code to the most recent Spark
>>> >> version by changing the build.sbt file?
>>> >>
>>> >> Or do you have any other suggestion?
>>> >>
>>> >> Thanks a lot,
>>> >> Anahita
>>> >>
>>> >
>>>
>>
>>
>
