spark-user mailing list archives

From Philip Ogren <philip.og...@oracle.com>
Subject Re: Where is reduceByKey?
Date Thu, 07 Nov 2013 21:05:50 GMT
Thanks!  That answers my question and solves my compile problem.


On 11/7/2013 2:01 PM, Josh Rosen wrote:
> The additional methods on RDDs of pairs are defined in a class called
> PairRDDFunctions
> (https://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.rdd.PairRDDFunctions).
> SparkContext provides an implicit conversion from RDD[(K, V)] to
> PairRDDFunctions[K, V] to make this transparent to users.
>
> To import those implicit conversions, use
>
>     import org.apache.spark.SparkContext._
>
>
> These conversions are automatically imported by Spark Shell, but
> you'll have to import them yourself in standalone programs (see the
> sketch below).
>
>
> On Thu, Nov 7, 2013 at 11:54 AM, Philip Ogren
> <philip.ogren@oracle.com> wrote:
>
>     On the front page <http://spark.incubator.apache.org/> of the
>     Spark website there is the following simple word count implementation:
>
>     file = spark.textFile("hdfs://...")
>     file.flatMap(line => line.split(" "))
>          .map(word => (word, 1))
>          .reduceByKey(_ + _)
>
>     The same code can be found in the Quick Start
>     <http://spark.incubator.apache.org/docs/latest/quick-start.html>
>     guide.  When I follow the steps in my spark-shell (version 0.8.0)
>     it works fine.  The reduceByKey method is also shown in the list
>     of transformations
>     <http://spark.incubator.apache.org/docs/latest/scala-programming-guide.html#transformations>
>     in the Spark Programming Guide.  The bottom of this list directs
>     the reader to the API docs for the class RDD (this link is broken,
>     BTW).  The API docs for RDD
>     <http://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.rdd.RDD>
>     do not list a reduceByKey method for RDD.  Also, when I try to
>     compile the above code in a Scala class definition, I get the
>     following compile error:
>
>     value reduceByKey is not a member of
>     org.apache.spark.rdd.RDD[(java.lang.String, Int)]
>
>     I am compiling with maven using the following dependency definition:
>
>     <dependency>
>         <groupId>org.apache.spark</groupId>
>         <artifactId>spark-core_2.9.3</artifactId>
>         <version>0.8.0-incubating</version>
>     </dependency>
>
>     Can someone help me understand why this code works fine from the
>     spark-shell but doesn't seem to exist in the API docs and won't
>     compile?
>
>     Thanks,
>     Philip
>
>
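For readers who hit the same compile error, here is a minimal standalone
sketch that pulls the two pieces of the thread together. It assumes the
Spark 0.8.0 Scala API discussed above; the object name WordCount, the
"local" master, and the call to collect() are illustrative choices rather
than part of the original example, and the "hdfs://..." input path is left
elided as in the thread.

    import org.apache.spark.SparkContext
    // Bring the implicit RDD[(K, V)] => PairRDDFunctions[K, V] conversion
    // into scope so that reduceByKey resolves at compile time.
    import org.apache.spark.SparkContext._

    object WordCount {                          // illustrative name
      def main(args: Array[String]) {
        // "local" master is a placeholder; point this at your cluster.
        val sc = new SparkContext("local", "WordCount")
        val file = sc.textFile("hdfs://...")    // path elided, as in the thread
        val counts = file.flatMap(line => line.split(" "))
                         .map(word => (word, 1))
                         .reduceByKey(_ + _)    // resolves via the import above
        counts.collect().foreach(println)
        sc.stop()
      }
    }

With the SparkContext._ import removed, the same code reproduces the
"value reduceByKey is not a member of org.apache.spark.rdd.RDD[(java.lang.String, Int)]"
error, because reduceByKey is defined on PairRDDFunctions rather than on
RDD itself; the Maven dependency shown above is otherwise correct.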

