spark-user mailing list archives

From innowireless TaeYun Kim <taeyun....@innowireless.co.kr>
Subject RE: Bulk-load to HBase
Date Fri, 19 Sep 2014 11:17:33 GMT
Hi,

After reading several documents, it seems that saveAsHadoopDataset cannot
be used with HFileOutputFormat.

This is because saveAsHadoopDataset takes a JobConf, so it belongs to the
old Hadoop API, while HFileOutputFormat is in the mapreduce package, which
is for the new Hadoop API.

Am I right?

If so, is there another way to bulk-load into HBase from an RDD?

Thanks.

From: innowireless TaeYun Kim [mailto:taeyun.kim@innowireless.co.kr] 
Sent: Friday, September 19, 2014 7:17 PM
To: user@spark.apache.org
Subject: Bulk-load to HBase

 

Hi,

Is there a way to bulk-load into HBase from an RDD?

HBase offers the HFileOutputFormat class for bulk loading from a MapReduce
job, but I cannot figure out how to use it with saveAsHadoopDataset.

Thanks.

