hbase-user mailing list archives

From Amandeep Khurana <ama...@gmail.com>
Subject Re: MR on HDFS data inserted via HBase?
Date Thu, 14 Jan 2010 04:08:30 GMT
HBase has its own file format. Writing your own job to read data directly
from those files will not be trivial, but it is not impossible.

Why would you want to use the underlying data files in the MR jobs? Is there
any limitation in using the HBase API?

On Wed, Jan 13, 2010 at 8:06 PM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:

> Hello,
> If I import data into HBase, can I still run a hand-written MapReduce job
> over that data in HDFS?
> That is, not using TableInputFormat to read the data back out via HBase.
> Similarly, can one run Hive or Pig scripts against that data, but again,
> without Hive or Pig reading the data via HBase, but rather getting to it
> directly via HDFS?  I'm asking because I'm wondering whether storing data in
> HBase means I can no longer use Hive and Pig to run my ad-hoc jobs.
> Thanks,
> Otis
> --
> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
