spark-dev mailing list archives

From Daniil Osipov <daniil.osi...@shazam.com>
Subject [spark-sql] JsonRDD
Date Tue, 03 Feb 2015 00:16:47 GMT
Hey Spark developers,

Is there a good reason for JsonRDD being a Scala object as opposed to a
class? It seems most other RDDs are classes and can be extended.
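
To make the distinction concrete, here is a minimal sketch in plain Scala. It is
not Spark code; BaseParser, LowercasingParser, and SingletonParser are
hypothetical names used only for illustration.

```scala
// A class can be subclassed and its behavior overridden.
class BaseParser {
  def fieldName(raw: String): String = raw
}

// Hypothetical subclass that normalizes field names to lower case.
class LowercasingParser extends BaseParser {
  override def fieldName(raw: String): String =
    raw.toLowerCase(java.util.Locale.ROOT)
}

// An object is a singleton value, not a type that can be subclassed.
object SingletonParser {
  def fieldName(raw: String): String = raw
}

// The following line would not compile if uncommented:
// class Custom extends SingletonParser
```

With a class, callers can swap in the subclass; with an object, the only
options are to wrap it or to copy its code.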

The reason I'm asking is that there is a Hive interoperability problem
with JSON DataFrames: jsonFile generates a case-sensitive schema, while
Hive expects case-insensitive column names, so saveAsTable fails with an
exception if two columns have the same name in different case.
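
As a rough illustration of the collision, here is a minimal sketch in plain
Scala that finds field names differing only by case. It does not use any Spark
API; CaseCollisions and collidingNames are hypothetical names, and the input is
assumed to be the list of top-level field names from an inferred schema.

```scala
object CaseCollisions {
  // Groups field names by their lower-cased form and keeps only the groups
  // with more than one distinct spelling, i.e. the names that would clash
  // in a case-insensitive catalog.
  def collidingNames(fieldNames: Seq[String]): Map[String, Seq[String]] =
    fieldNames.distinct
      .groupBy(_.toLowerCase(java.util.Locale.ROOT))
      .filter { case (_, names) => names.size > 1 }
}
```

For example, CaseCollisions.collidingNames(Seq("userId", "UserID", "name"))
reports "userId" and "UserID" under the key "userid", while a schema with no
such clash yields an empty map.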

I'm trying to resolve the problem, but that requires me to extend JsonRDD,
which I can't do. Other RDDs are subclass-friendly, so why is JsonRDD
different?

Dan
