spark-user mailing list archives

From Umesh Kacha <umesh.ka...@gmail.com>
Subject Re: How to create Spark DataFrame using custom Hadoop InputFormat?
Date Sat, 01 Aug 2015 04:52:07 GMT
Hi, thanks. Void works: I use the same custom format in Hive and it works
with Void as the key. Please share an example, if you have one, of creating
a DataFrame using a custom Hadoop format.
On Aug 1, 2015 2:07 AM, "Ted Yu" <yuzhihong@gmail.com> wrote:

> I don't think using Void class is the right choice - it is not even a
> Writable.
>
> BTW, in the future, please capture text output instead of an image.
>
> Thanks
>
> On Fri, Jul 31, 2015 at 12:35 PM, Umesh Kacha <umesh.kacha@gmail.com>
> wrote:
>
>> Hi Ted, thanks. My key is always Void because my custom format file is
>> non-splittable, so the key is Void and the value is MyRecordWritable,
>> which extends Hadoop Writable. I am sharing my log as a screenshot; please
>> don't mind, as I can't paste code outside.
>>
>> Regards,
>> Umesh
>>
>> On Sat, Aug 1, 2015 at 12:59 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>>> Looking closer at the code you posted, the error likely was caused by
>>> the 3rd parameter: Void.class
>>>
>>> It is supposed to be the class of the key.
>>>
>>> FYI
>>>
>>> On Fri, Jul 31, 2015 at 11:24 AM, unk1102 <umesh.kacha@gmail.com> wrote:
>>>
>>>> Hi, I have my own custom Hadoop InputFormat, which I need to use to
>>>> create a DataFrame. I tried the following:
>>>>
>>>> JavaPairRDD<Void, MyRecordWritable> myFormatAsPairRdd =
>>>>     jsc.hadoopFile("hdfs://tmp/data/myformat.xyz", MyInputFormat.class,
>>>>         Void.class, MyRecordWritable.class);
>>>> JavaRDD<MyRecordWritable> myformatRdd = myFormatAsPairRdd.values();
>>>> DataFrame myFormatAsDataframe =
>>>>     sqlContext.createDataFrame(myformatRdd, MyFormatSchema.class);
>>>> myFormatAsDataframe.show();
>>>>
>>>> The above code does not work and throws an exception saying
>>>> java.lang.IllegalArgumentException: object is not an instance of
>>>> declaring class
>>>>
>>>> My custom Hadoop InputFormat works very well with Hive, MapReduce, etc.
>>>> How do I make it work with Spark? Please guide me; I am new to Spark.
>>>> Thanks in advance.
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-create-Spark-DataFrame-using-custom-Hadoop-InputFormat-tp24101.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>
>
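
The message "object is not an instance of declaring class" is the one Java
reflection raises when a Method is invoked on an object of a different class.
The sketch below reproduces it in plain Java, with no Spark or Hadoop on the
classpath. It rests on an assumption the thread does not confirm: that
createDataFrame(rdd, beanClass) reads each row through the getters of
beanClass. If so, passing an RDD of MyRecordWritable together with
MyFormatSchema.class could fail in the same way. The class bodies,
readProperties and BeanMismatchDemo are hypothetical stand-ins, not code from
the thread.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

public class BeanMismatchDemo {

    // Hypothetical stand-in for the value type produced by the InputFormat.
    public static class MyRecordWritable {
        private final String name;
        public MyRecordWritable(String name) { this.name = name; }
        public String getName() { return name; }
    }

    // Hypothetical stand-in for a separate schema bean with the same property.
    public static class MyFormatSchema {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Reads every readable bean property of beanClass from record by calling
    // beanClass's getters reflectively (assumed to mirror bean-based schemas).
    public static String readProperties(Object record, Class<?> beanClass)
            throws Exception {
        BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
        StringBuilder sb = new StringBuilder();
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            Method getter = pd.getReadMethod();
            if (getter == null) {
                continue; // write-only property, nothing to read
            }
            sb.append(pd.getName()).append('=')
              .append(getter.invoke(record)).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        MyRecordWritable record = new MyRecordWritable("row-1");

        // Bean class matches the record's class: the getter call succeeds.
        System.out.println(readProperties(record, MyRecordWritable.class));

        // Bean class differs from the record's class: Method.invoke rejects
        // the receiver with an IllegalArgumentException.
        try {
            readProperties(record, MyFormatSchema.class);
        } catch (IllegalArgumentException e) {
            System.out.println("mismatch: " + e.getMessage());
        }
    }
}
```

If that assumption holds, two things would be worth trying: pass the class of
the objects actually held in the RDD (MyRecordWritable.class, provided it
exposes bean-style getters), or first map each MyRecordWritable to a
MyFormatSchema instance and then call createDataFrame with
MyFormatSchema.class.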
