hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Remove the row in MR job?
Date Fri, 12 Oct 2012 17:20:09 GMT

I have a table that I want to parse with an MR job.

Today, I'm using a scan to parse all the rows. Each row is retrieved,
removed, and then parsed (feeding 2 other tables).

The goal is to parse all the content while some other process might
still be adding more.

In the map method of the MR job, can I delete the row I'm working
on? If so, how should I do it? Should I take the table from the pool
and simply call the delete method? The issue is that doing a delete
for each row will take a while. I would prefer to batch them, but I
don't know which row will be the last one, so it's difficult to know
when to send the final batch. Is there a way to tell the MR job to
delete this row? Also, what's the impact on the MR job if I delete the
row it's working on?

Or is an MR job not the best way to do this?
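
To make the batching idea concrete, here is a rough sketch of what I
have in mind. It is plain Java with no HBase calls: BatchingDeleter and
its sink are made-up names for illustration, and the sink only stands
in for whatever batched delete the table offers (I believe HTable has a
delete method that takes a list of Delete objects). The open question
is still where the final flush() should be called from; I am assuming
there is some end-of-task hook for that, possibly the mapper's
cleanup() method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers row keys and hands them to a sink in batches. */
class BatchingDeleter {
    private final int batchSize;
    // Stand-in for a batched table delete; not a real HBase API.
    private final Consumer<List<String>> sink;
    private final List<String> pending = new ArrayList<>();

    BatchingDeleter(int batchSize, Consumer<List<String>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Intended to be called once per row, from map(). */
    void add(String rowKey) {
        pending.add(rowKey);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; does nothing when the buffer is empty. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Pass a copy so the sink keeps its own list after we clear ours.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

With a batch size of 2 and three rows added, the sink would get one
batch of two keys right away and the third key only when flush() is
called at the end, which is exactly the call I don't know where to put.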


