spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Reading Hive RCFiles?
Date Sat, 20 Jan 2018 23:55:53 GMT
Forgot to add the mailing list.

> On 18. Jan 2018, at 18:55, Jörn Franke <jornfranke@gmail.com> wrote:
> 
> Well, you can use:
> https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkContext.html#hadoopRDD-org.apache.hadoop.mapred.JobConf-java.lang.Class-java.lang.Class-java.lang.Class-int-
> 
> with the following InputFormat:
> https://hive.apache.org/javadocs/r2.1.1/api/org/apache/hadoop/hive/ql/io/RCFileInputFormat.html
> 
> (note: the version of the Javadoc does not matter; this has been possible for a long time).
> 
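For the curious, that read path can be sketched roughly as below. This is an untested sketch: it assumes a Spark build with the hive-exec and hive-serde jars on the classpath, and the input path and app name are placeholders.

```scala
import org.apache.hadoop.io.LongWritable
import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
import org.apache.hadoop.hive.ql.io.RCFileInputFormat
import org.apache.hadoop.hive.serde2.columnar.BytesRefArrayWritable
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("rcfile-read"))

// Point an old-API (mapred) JobConf at the RCFile directory.
val conf = new JobConf(sc.hadoopConfiguration)
FileInputFormat.setInputPaths(conf, "/path/to/rcfiles") // placeholder path

// RCFileInputFormat yields (row offset, columns-of-bytes) pairs.
val rdd = sc.hadoopRDD(
  conf,
  classOf[RCFileInputFormat[LongWritable, BytesRefArrayWritable]],
  classOf[LongWritable],
  classOf[BytesRefArrayWritable])

// Decode each column of each row as a UTF-8 string
// (adjust the decoding to your actual column types).
val rows = rdd.map { case (_, cols) =>
  (0 until cols.size()).map { i =>
    val ref = cols.get(i)
    new String(ref.getData, ref.getStart, ref.getLength, "UTF-8")
  }
}
```

Note that RCFile predates the new `mapreduce` API in this code path, hence the `mapred.JobConf` and `hadoopRDD` rather than `newAPIHadoopRDD`.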
> Writing works similarly, using a PairRDD and RCFileOutputFormat.
> 
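A corresponding write sketch, equally untested: the toy data, column count, output path, and UTF-8 encoding are all assumptions for illustration.

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.JobConf
import org.apache.hadoop.hive.ql.io.RCFileOutputFormat
import org.apache.hadoop.hive.serde2.columnar.{BytesRefArrayWritable, BytesRefWritable}
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("rcfile-write"))
val data = sc.parallelize(Seq(Seq("1", "alice"), Seq("2", "bob"))) // toy rows

val conf = new JobConf(sc.hadoopConfiguration)
RCFileOutputFormat.setColumnNumber(conf, 2) // RCFile needs the column count up front

// Pack each row's columns into a BytesRefArrayWritable.
val pairs = data.map { cols =>
  val row = new BytesRefArrayWritable(cols.size)
  cols.zipWithIndex.foreach { case (c, i) =>
    val bytes = c.getBytes("UTF-8")
    row.set(i, new BytesRefWritable(bytes, 0, bytes.length))
  }
  (NullWritable.get(), row)
}

pairs.saveAsHadoopFile("/path/to/output", classOf[NullWritable],
  classOf[BytesRefArrayWritable], classOf[RCFileOutputFormat], conf)
```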
>> On Thu, Jan 18, 2018 at 5:02 PM, Michael Segel <msegel_hadoop@hotmail.com> wrote:
>> No idea how that last line of garbage got into the message.
>> 
>> 
>> > On Jan 18, 2018, at 9:32 AM, Michael Segel <msegel_hadoop@hotmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I’m trying to find out if there’s a simple way for Spark to read an RCFile.
>> >
>> > I know I can create a table in Hive, then drop the files into that directory and use a SQL context to read the file from Hive; however, I wanted to read the file directly.
>> >
>> > There are not a lot of details to go on… even the Apache site’s links are broken.
>> > See :
>> > https://cwiki.apache.org/confluence/display/Hive/RCFile
>> >
>> > Then try to follow the Javadoc link.
>> >
>> >
>> > Any suggestions?
>> >
>> > Thx
>> >
>> > -Mike
>> >
>> >
> 
