spark-user mailing list archives

From Tobias Pfeiffer <...@preferred.jp>
Subject Re: MatchError in JsonRDD.toLong
Date Fri, 16 Jan 2015 09:10:52 GMT
Hi,

On Fri, Jan 16, 2015 at 5:55 PM, Wang, Daoyuan <daoyuan.wang@intel.com>
wrote:
>
> Can you provide how you create the JsonRDD?
>

This should be reproducible in the Spark shell:

---------------------------------------------------------
import org.apache.spark.sql._
val sqlc = new SQLContext(sc)
// Two JSON records; the second DisplayURL is larger than Long.MaxValue
val rdd = sc.parallelize(
  """{"Click":"nonclicked", "Impression":1, "DisplayURL":4401798909506983219, "AdId":21215341}""" ::
  """{"Click":"nonclicked", "Impression":1, "DisplayURL":14452800566866169008, "AdId":10587781}""" :: Nil)

// works fine (schema inferred from all rows)
val json = sqlc.jsonRDD(rdd)
json.registerTempTable("test")
sqlc.sql("SELECT * FROM test").collect

// -> MatchError (schema inferred with samplingRatio = 0.1)
val json2 = sqlc.jsonRDD(rdd, 0.1)
json2.registerTempTable("test2")
sqlc.sql("SELECT * FROM test2").collect
---------------------------------------------------------

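I guess the issue in the latter case is that, with a sampling ratio of 0.1,
the schema is inferred from only some of the rows, so the DisplayURL column
is inferred as Long even though some rows (here 14452800566866169008, which
is greater than Long.MaxValue = 9223372036854775807) are too big for a Long...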
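
To check the size claim independently of Spark, here is a minimal sketch in plain Scala (pasteable into the Spark shell or a Scala REPL). The names displayUrls and fitsInLong are made up for illustration; this only compares the two DisplayURL values against Long.MaxValue and does not reproduce the MatchError itself:

```scala
// The two DisplayURL values from the JSON records, kept as strings so that
// parsing cannot overflow.
val displayUrls = Seq("4401798909506983219", "14452800566866169008")

// BigInt has arbitrary precision, so the comparison against the Long range is exact.
val fitsInLong = displayUrls.map { s =>
  val n = BigInt(s)
  s -> (n >= BigInt(Long.MinValue) && n <= BigInt(Long.MaxValue))
}

fitsInLong.foreach { case (s, ok) => println(s"$s fits in Long: $ok") }
```

The first value fits in a Long and the second one does not, which is consistent with the guess that a schema inferred from a sample containing only the first row would pick Long for that column.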

Thanks
Tobias
