spark-dev mailing list archives

From Daniil Osipov <daniil.osi...@shazam.com>
Subject Re: [spark-sql] JsonRDD
Date Tue, 03 Feb 2015 17:15:46 GMT
Thanks Reynold,

Case sensitivity issues are definitely orthogonal. I'll submit a bug or PR.

Is there a way to rename the object to eliminate the confusion? I'm not sure
how locked down the API is at this time, but the current name seems likely
to confuse developers.

On Mon, Feb 2, 2015 at 4:30 PM, Reynold Xin <rxin@databricks.com> wrote:

> It's bad naming - JsonRDD is actually not an RDD. It is just a set of util
> methods.
>
> The case sensitivity issues seem orthogonal, and it would be great to be
> able to control that with a flag.
>
>
> On Mon, Feb 2, 2015 at 4:16 PM, Daniil Osipov <daniil.osipov@shazam.com>
> wrote:
>
>> Hey Spark developers,
>>
>> Is there a good reason for JsonRDD being a Scala object as opposed to a
>> class? It seems most other RDDs are classes and can be extended.
>>
>> The reason I'm asking is that there is a problem with Hive interoperability
>> with JSON DataFrames: jsonFile generates a case-sensitive schema, while
>> Hive expects a case-insensitive one and fails with an exception during
>> saveAsTable if two columns have the same name in different case.
>>
>> I'm trying to resolve the problem, but that requires me to extend JsonRDD,
>> which I can't do. Other RDDs are subclass-friendly; why is JsonRDD
>> different?
>>
>> Dan
>>
>
>
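
The case conflict described in the thread (two columns whose names differ only by letter case) can be checked for before attempting to save a table. What follows is a minimal sketch in plain Python, not Spark's or Hive's own API: the helper `find_case_collisions` and the sample column names are hypothetical, and it assumes that simple lowercasing is an adequate stand-in for case-insensitive comparison.

```python
from collections import defaultdict


def find_case_collisions(column_names):
    """Group column names that differ only by letter case.

    Hypothetical helper for illustration; not part of Spark or Hive.
    """
    groups = defaultdict(list)
    for name in column_names:
        # Assumption: lowercasing approximates case-insensitive matching.
        groups[name.lower()].append(name)
    # Keep only the keys that map to more than one distinct spelling.
    return {key: names for key, names in groups.items() if len(set(names)) > 1}


# Hypothetical field names, as might be inferred from a JSON file.
columns = ["userId", "userid", "timestamp", "Country"]
collisions = find_case_collisions(columns)
print(collisions)
```

A non-empty result flags the names that would need renaming or merging before writing to a case-insensitive store.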
