Hi,

On Fri, Jan 16, 2015 at 5:55 PM, Wang, Daoyuan <daoyuan.wang@intel.com> wrote:

Can you provide how you create the JsonRDD?


This should be reproducible in the Spark shell:

---------------------------------------------------------
import org.apache.spark.sql._
val sqlc = new SQLContext(sc)
val rdd = sc.parallelize("""{"Click":"nonclicked", "Impression":1, "DisplayURL":4401798909506983219, "AdId":21215341}""" ::
                         """{"Click":"nonclicked", "Impression":1, "DisplayURL":14452800566866169008, "AdId":10587781}""" :: Nil)

// works fine
val json = sqlc.jsonRDD(rdd)
json.registerTempTable("test")
sqlc.sql("SELECT * FROM test").collect

// -> MatchError
val json2 = sqlc.jsonRDD(rdd, 0.1)
json2.registerTempTable("test2")
sqlc.sql("SELECT * FROM test2").collect
---------------------------------------------------------

I guess the issue in the latter case is that with a sampling ratio of 0.1 the schema is inferred from only a subset of the rows, so DisplayURL gets typed as Long even though some values (e.g. 14452800566866169008) are too big for a Long, which then triggers the MatchError when those rows are read...
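To illustrate what I mean (just a sketch of the overflow, outside Spark): Long.MaxValue has 19 digits, while the second DisplayURL value has 20, so it can only be represented as a BigInt/Decimal, not a Long:

```scala
// Long.MaxValue is 9223372036854775807 (19 digits).
// The first DisplayURL fits in a Long; the second does not.
val fits    = BigInt("4401798909506983219")  <= BigInt(Long.MaxValue)
val tooBig  = BigInt("14452800566866169008") >  BigInt(Long.MaxValue)
println(s"first fits: $fits, second overflows: $tooBig")  // true, true
```

So if the sampled rows happen to contain only the first value, the inferred LongType cannot hold the second one.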

Thanks
Tobias