spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Write to HBase from spark job
Date Sat, 12 Oct 2013 17:42:38 GMT
Hi Eugen,

You should use saveAsHadoopDataset, to which you pass a JobConf object that you've configured
with TableOutputFormat the same way you would for a MapReduce job. The saveAsHadoopFile methods
are specifically for output formats that go to a filesystem (e.g. HDFS), but HBase isn't a
filesystem.
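
As an illustration, a minimal Scala sketch of that setup might look like the following. It assumes the old-API org.apache.hadoop.hbase.mapred.TableOutputFormat, a Spark release that provides SparkConf, and an HBase client that has Put.addColumn (older clients use Put.add with the same arguments). The table name "my_table", the column family "cf", the qualifier "col" and the sample rows are hypothetical placeholders, and the table is assumed to exist already.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapred.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapred.JobConf
import org.apache.spark.{SparkConf, SparkContext}
// On older Spark releases, also: import org.apache.spark.SparkContext._

object HBaseWriteSketch {
  def main(args: Array[String]): Unit = {
    // Master URL is expected to come from the launcher (e.g. spark-submit).
    val sc = new SparkContext(new SparkConf().setAppName("hbase-write-sketch"))

    // Configure the JobConf the same way a MapReduce job writing to HBase would.
    val jobConf = new JobConf(HBaseConfiguration.create())
    jobConf.setOutputFormat(classOf[TableOutputFormat])
    jobConf.set(TableOutputFormat.OUTPUT_TABLE, "my_table") // hypothetical table

    // Hypothetical sample data: (row key, cell value).
    val rows = sc.parallelize(Seq(("row1", "a"), ("row2", "b")))

    // TableOutputFormat expects (ImmutableBytesWritable, Put) pairs.
    val puts = rows.map { case (key, value) =>
      val put = new Put(Bytes.toBytes(key))
      // "cf" and "col" are hypothetical; use put.add(...) on older HBase clients.
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
      (new ImmutableBytesWritable(Bytes.toBytes(key)), put)
    }

    // No output path is involved: the JobConf carries the target table.
    puts.saveAsHadoopDataset(jobConf)
    sc.stop()
  }
}
```

The HBase connection settings (ZooKeeper quorum and so on) are assumed to be picked up from an hbase-site.xml on the classpath via HBaseConfiguration.create().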

Matei

On Oct 11, 2013, at 8:53 AM, Eugen Cepoi <cepoi.eugen@gmail.com> wrote:

> Hi there,
> 
> I have got a few questions on how best to write to HBase from a spark job.
> 
> - If we want to write using TableOutputFormat are we supposed to use saveAsNewAPIHadoopFile?
> - Or should we do it by hand (without TableOutputFormat) in a foreach loop for example?
> - Or should we use HFileOutputFormat with saveAsNewAPIHadoopFile?
> 
> Thanks,
> Eugen

