spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: help on SparkContext.sequenceFile()
Date Sat, 19 Oct 2013 05:46:58 GMT
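You need to add import org.apache.spark.SparkContext._ at the top of your program. The implicit WritableConverter values that sequenceFile asks for are defined in the SparkContext object, so the compiler cannot find them until that import brings them into scope.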

Matei
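
A minimal sketch of the fix in context, modelled on the FileSuite test linked further down the thread. It writes a small sequence file and reads it back twice: once with Hadoop Writable types for K and V, and once with plain Int and String. The object name SequenceFileExample, the roundTrip helper, the "local" master and the temporary directory are illustrative assumptions, not part of the original messages.

```scala
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.spark.SparkContext
// Brings the implicit WritableConverter values into scope, as the reply
// above recommends. On the 0.8 line it also supplies saveAsSequenceFile.
import org.apache.spark.SparkContext._

object SequenceFileExample {
  // Hypothetical helper: writes three (Int, String) pairs to `dir`, then reads
  // them back with Writable type parameters and with primitive ones.
  def roundTrip(sc: SparkContext, dir: String): (List[String], List[(Int, String)]) = {
    sc.parallelize(1 to 3).map(x => (x, "a" * x)).saveAsSequenceFile(dir)

    // K and V given as Writable classes. Hadoop reuses Writable instances
    // between records, so convert each record before collecting.
    val asWritables = sc.sequenceFile[IntWritable, Text](dir)
      .map { case (k, v) => k.get.toString + "=" + v.toString }
      .collect().toList.sorted

    // K and V given as Int and String; Spark selects IntWritable and Text.
    val asPrimitives = sc.sequenceFile[Int, String](dir).collect().toList.sorted

    (asWritables, asPrimitives)
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local", "sequence-file-example")
    try {
      // The output path must not exist yet, so use a fresh child of a temp dir.
      val dir = java.nio.file.Files.createTempDirectory("seq").resolve("out").toString
      val (asWritables, asPrimitives) = roundTrip(sc, dir)
      asWritables.foreach(println)
      asPrimitives.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The same import should be all that the readAsRdd function quoted below needs in order to compile; the sequenceFile[Text, BytesWritable] call itself can stay as written.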

On Oct 18, 2013, at 5:56 PM, Shay Seng <shay@1618labs.com> wrote:

> I seem to be having issues compiling it though...
>   def readAsRdd[T: ClassManifest](sc: org.apache.spark.SparkContext, uri:String, clazz:java.lang.Class[_]) = {
>     val rdd = sc.sequenceFile[org.apache.hadoop.io.Text, org.apache.hadoop.io.BytesWritable](uri)
>     rdd.map(l=>{
>       val sz = l._2.getLength
>       val b = l._2.getBytes.slice(0,sz)
>       val parseFrom = clazz.getMethod("parseFrom",Class.forName("[B"))
>       parseFrom.invoke(null,b).asInstanceOf[T]
>     })
>   }
> 
> [ERROR] /Users/shay/sb/experiment/sps-emr/ue/src/main/scala/ue/proto.scala:57: error: could not find implicit value for parameter kcf: () => org.apache.spark.WritableConverter[org.apache.hadoop.io.Text]
> [ERROR] Error occurred in an application involving default arguments.
> [INFO]     val rdd = sc.sequenceFile[org.apache.hadoop.io.Text, org.apache.hadoop.io.BytesWritable](uri)
> 
> 
> 
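
The compile error quoted above comes down to Scala's implicit lookup rules, which a small self-contained sketch can reproduce without Spark on the classpath. Every name in it is a hypothetical stand-in: Converter plays the role of WritableConverter, Context the role of SparkContext, and describe the role of sequenceFile. The real method takes its converters as () => WritableConverter[K]; plain implicit values are used here to keep the sketch short.

```scala
// Hypothetical stand-ins only; none of these names are Spark APIs.
class Converter[T](val name: String)

class Context {
  // Like sequenceFile, the converters arrive through an implicit parameter list.
  def describe[K, V](path: String)(implicit kc: Converter[K], vc: Converter[V]): String =
    path + ": " + kc.name + " -> " + vc.name
}

object Context {
  // These live in Context's companion object, not in a companion of Converter,
  // so they are outside the implicit scope of Converter[Int] and
  // Converter[String] until they are imported.
  implicit val intConverter: Converter[Int] = new Converter[Int]("IntWritable")
  implicit val stringConverter: Converter[String] = new Converter[String]("Text")
}

object ImplicitScopeDemo {
  // Without this import, the call in run() is rejected with
  // "could not find implicit value for parameter kc".
  import Context._

  def run(): String = new Context().describe[Int, String]("data.seq")

  def main(args: Array[String]): Unit = println(run())
}
```

The caller never names the converters. The compiler supplies them, but only from values it can see, which is why a single wildcard import is enough to make the call compile.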
> On Fri, Oct 18, 2013 at 9:37 AM, Matei Zaharia <matei.zaharia@gmail.com> wrote:
> Don't worry about the implicit params, those are filled in by the compiler. All you need
> to do is provide a key and value type, and a path. Look at how sequenceFile gets used in this
> test:
> 
> https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=blob;f=core/src/test/scala/spark/FileSuite.scala;hb=af3c9d50
> 
> In particular, the K and V in Spark can be any Writable class, *or* primitive types like
> Int, Double, etc, or String. For the latter ones, Spark automatically uses the correct Hadoop
> Writable (e.g. IntWritable, DoubleWritable, Text).
> 
> Matei
> 
> 
> 
> On Oct 17, 2013, at 5:35 PM, Shay Seng <shay@1618labs.com> wrote:
> 
>> Hey gurus,
>> 
>> I'm having a little trouble deciphering the docs for 
>> 
>> sequenceFile[K, V](path: String, minSplits: Int = defaultMinSplits)(implicit km: ClassManifest[K], vm: ClassManifest[V], kcf: () ⇒ WritableConverter[K], vcf: () ⇒ WritableConverter[V]): RDD[(K, V)]
>> 
>> Does anyone have a short example snippet?
>> 
>> tks
>> shay
>> 
>> 
> 
> 

