spark-user mailing list archives

From innowireless TaeYun Kim <taeyun....@innowireless.co.kr>
Subject RE: Bulk-load to HBase
Date Fri, 19 Sep 2014 11:22:26 GMT
Hi,

 

Sorry, I just found saveAsNewAPIHadoopDataset.

Then, can I use HFileOutputFormat with saveAsNewAPIHadoopDataset? Is there
any example code for that?

 

Thanks.
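
For reference, here is a minimal sketch of how HFileOutputFormat might be
combined with saveAsNewAPIHadoopDataset. It is not a tested recipe. It assumes
Spark 1.x, Hadoop 2, and the HBase 0.98-era client classes (HTable,
HFileOutputFormat, LoadIncrementalHFiles). The table name, column family,
qualifier, output directory, and sample rows are hypothetical. Later HBase
releases replace some of these classes (for example with HFileOutputFormat2),
so names and signatures need checking against the version in use.

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.{HBaseConfiguration, KeyValue}
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.{HFileOutputFormat, LoadIncrementalHFiles}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits on Spark 1.x

object BulkLoadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("BulkLoadSketch"))

    // Hypothetical names, for illustration only.
    val tableName = "my_table"
    val family    = Bytes.toBytes("cf")
    val qualifier = Bytes.toBytes("q")
    val hfileDir  = new Path("/tmp/hfiles-out")

    val hbaseConf = HBaseConfiguration.create()
    val table     = new HTable(hbaseConf, tableName)

    // A new-API (org.apache.hadoop.mapreduce) Job carries the output settings.
    val job = Job.getInstance(hbaseConf)
    job.setMapOutputKeyClass(classOf[ImmutableBytesWritable])
    job.setMapOutputValueClass(classOf[KeyValue])
    // Sets HFileOutputFormat as the output format, plus per-table settings
    // such as compression, on the job configuration.
    HFileOutputFormat.configureIncrementalLoad(job, table)
    FileOutputFormat.setOutputPath(job, hfileDir)

    // HFiles must be written in row-key order, so sort before converting.
    // String order matches byte order here only because the keys are ASCII.
    val cells = sc.parallelize(Seq("row3" -> "c", "row1" -> "a", "row2" -> "b"))
      .sortByKey()
      .map { case (row, value) =>
        val rowBytes = Bytes.toBytes(row)
        (new ImmutableBytesWritable(rowBytes),
         new KeyValue(rowBytes, family, qualifier, Bytes.toBytes(value)))
      }

    // Write HFiles through the new Hadoop API, using the job's configuration.
    cells.saveAsNewAPIHadoopDataset(job.getConfiguration)

    // Hand the generated HFiles over to the region servers.
    new LoadIncrementalHFiles(hbaseConf).doBulkLoad(hfileDir, table)

    table.close()
    sc.stop()
  }
}
```

The sketch writes one cell per row. With several columns per row, the cells
within each row would also need to be emitted in sorted order.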

 

From: innowireless TaeYun Kim [mailto:taeyun.kim@innowireless.co.kr] 
Sent: Friday, September 19, 2014 8:18 PM
To: user@spark.apache.org
Subject: RE: Bulk-load to HBase

 

Hi,

 

After reading several documents, it seems that saveAsHadoopDataset cannot
use HFileOutputFormat.

This is because the saveAsHadoopDataset method takes a JobConf, so it belongs
to the old Hadoop API (org.apache.hadoop.mapred), while HFileOutputFormat is a
member of the mapreduce package, which is for the new Hadoop API.

 

Am I right?

If so, is there another way to bulk-load into HBase from an RDD?

 

Thanks.

 

From: innowireless TaeYun Kim [mailto:taeyun.kim@innowireless.co.kr] 
Sent: Friday, September 19, 2014 7:17 PM
To: user@spark.apache.org
Subject: Bulk-load to HBase

 

Hi,

 

Is there a way to bulk-load into HBase from an RDD?

HBase offers the HFileOutputFormat class for bulk loading from a MapReduce job,
but I cannot figure out how to use it with saveAsHadoopDataset.

 

Thanks.

