spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: Implicit conversion RDD -> SchemaRDD
Date Thu, 02 Oct 2014 18:04:34 GMT
You need to define the case class outside of your method.  Otherwise the
Scala compiler implicitly adds a pointer to the containing scope to your
class, which confuses things.
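
For example, a minimal standalone sketch of the fix, mirroring the shell
session quoted below (the object and app names here are just placeholders;
this assumes the 1.1-era SQLContext, whose createSchemaRDD implicit performs
the RDD -> SchemaRDD conversion):

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.SQLContext

    // Top-level case class: the compiler adds no pointer to an enclosing scope.
    case class MyTable(col1: String, col2: Byte)

    object SchemaRddExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local[2]", "SchemaRddExample")
        val ctx = new SQLContext(sc)
        import ctx._  // brings the implicit createSchemaRDD conversion into scope

        val myRows = sc.parallelize(Range(1, 21).map(ix => MyTable(s"col1$ix", ix.toByte)))
        val myRowsSchema = myRows.where("1=1")  // now resolves, as it did in the shell
        myRowsSchema.registerTempTable("MyTempTab")
      }
    }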

On Thu, Oct 2, 2014 at 7:20 AM, Stephen Boesch <javadba@gmail.com> wrote:

>
> Here is the specific code
>
>
>     val sc = new SparkContext(s"local[$NWorkers]", "HBaseTestsSparkContext")
>     val ctx = new SQLContext(sc)
>     import ctx._
>     case class MyTable(col1: String, col2: Byte)
>     val myRows = ctx.sparkContext.parallelize((Range(1,21).map{ix =>
>         MyTable(s"col1$ix", ix.toByte)
>     }))
>     val myRowsSchema = myRows.where("1=1")     // "Line 127"
>     val TempTabName = "MyTempTab"
>     myRowsSchema.registerTempTable(TempTabName)
>
>
> The above does not compile:
>
>
> Error:(127, 31) value where is not a member of
> org.apache.spark.rdd.RDD[MyTable]
>     val myRowsSchema = myRows.where("1=1")
>                               ^
>
> However, copying the above code into the Spark shell works - notice we
> get a logical/physical plan from the aforementioned "line 127":
>
> scala>     val myRowsSchema = myRows.where("1=1")
> myRowsSchema: org.apache.spark.sql.SchemaRDD =
> SchemaRDD[29] at RDD at SchemaRDD.scala:102
> == Query Plan ==
> == Physical Plan ==
> Filter 1=1
>  ExistingRdd [col1#8,col2#9], MapPartitionsRDD[27] at mapPartitions at
> basicOperators.scala:219
>
>
> So ..   what is the magic formula for setting up the imports so that the
> SchemaRDD conversion works properly?
>
> 2014-10-02 2:00 GMT-07:00 Stephen Boesch <javadba@gmail.com>:
>
>
>> I am noticing disparities in behavior between the REPL and my
>> standalone program in terms of implicit conversion of an RDD to SchemaRDD.
>>
>> In the REPL the following sequence works:
>>
>>
>> import sqlContext._
>>
>> val mySchemaRDD = myNormalRDD.where("1=1")
>>
>>
>> However, when attempting the same in a standalone program, it does not
>> compile, with the message:
>>
>>   "value where is not a member of org.apache.spark.rdd.RDD[MyRecord]"
>>
>>
>> What is the required recipe for proper implicit conversion, given that I
>> have done the import sqlContext._ in the standalone program as well, but
>> it is not sufficient there?  Note: the IntelliJ IDE *does* seem to think
>> that "import sqlContext._" is enough - it understands the implicit use of
>> "where". But even in IntelliJ it does not actually compile. Rather strange.
>>
>
>
