hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Batch update gain
Date Tue, 15 Apr 2008 16:09:24 GMT
David Alves wrote:
> Hi All
>             I'm currently rewriting my own TableOutputFormat classes to
> comply with the new APIs introduced in the latest version and I was
> wondering if it would be valuable to rewrite them as buffered writers,
> meaning keeping a predetermined set of records (set by size to avoid OOME)
> before committing them to HBase.

Commits are by row. Are you talking about batching up rows before
forwarding them to HBase?
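To make the question concrete, here is a minimal sketch of the kind of size-bounded buffering being asked about. Everything in it is an assumption for illustration: `BufferedRowWriter`, `RowUpdate`, and the `sink` callback are hypothetical names, not HBase classes, and the sink merely stands in for whatever code commits each row. Since commits are by row, whether buffering like this gains anything depends on how the buffered rows are then forwarded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical buffered writer: holds row updates and hands them on in batches. */
class BufferedRowWriter {
    /** One pending row update; illustrative only, not an HBase type. */
    static final class RowUpdate {
        final String row;
        final byte[] value;

        RowUpdate(String row, byte[] value) {
            this.row = row;
            this.value = value;
        }

        /** Rough size estimate: characters in the row key plus value bytes. */
        long estimatedSize() {
            return row.length() + value.length;
        }
    }

    private final long maxBufferedSize;
    // Assumption: stand-in for the code that would commit each row to the table.
    private final Consumer<List<RowUpdate>> sink;
    private final List<RowUpdate> buffer = new ArrayList<>();
    private long bufferedSize = 0;

    BufferedRowWriter(long maxBufferedSize, Consumer<List<RowUpdate>> sink) {
        this.maxBufferedSize = maxBufferedSize;
        this.sink = sink;
    }

    /** Buffers one update and flushes once the size bound is reached. */
    void write(String row, byte[] value) {
        RowUpdate update = new RowUpdate(row, value);
        buffer.add(update);
        bufferedSize += update.estimatedSize();
        if (bufferedSize >= maxBufferedSize) {
            flush();
        }
    }

    /** Hands a copy of the buffered updates to the sink and resets the buffer. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        bufferedSize = 0;
    }
}
```

The bound is on estimated size rather than record count, which matches the stated goal of avoiding an OOME. A caller would need to invoke `flush()` once more when the writer is closed so that the tail of the buffer is not lost.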

>             What are your thoughts about this?
>             On another note, I think it would be valuable to rewrite the
> TableInputFormat class to be extendable. For example in my case I needed a
> Filtered (RegExpRowFilter) TableInputFormat and could not extend the
> original because its instance of HTable is package-private.
This needs to be done before the 0.2.0 release. It's been on my mind. I
just made a JIRA for it. Dump any thoughts you have on how it might
work into HBASE-581. At a minimum, add a note on what currently prevents
you from subclassing.
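As one possible starting point for that discussion, here is a rough sketch of an extension hook, not a design for the actual class. All names are hypothetical: `ExtendableInputFormat` is a stand-in that scans an in-memory list of row keys instead of an HTable, and `rowFilter()` is an assumed hook, not an existing HBase method. The point is only that state and hooks exposed as `protected` are reachable from subclasses in other packages, whereas package-private members are not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Hypothetical stand-in for an input format; not the real TableInputFormat. */
class ExtendableInputFormat {
    // Stand-in for the table handle; a plain list of row keys in this sketch.
    private final List<String> rows;

    public ExtendableInputFormat(List<String> rows) {
        this.rows = rows;
    }

    /** Protected accessor, so subclasses outside the package can reach the rows. */
    protected List<String> getRows() {
        return rows;
    }

    /** Hook for subclasses; the default accepts every row. */
    protected Predicate<String> rowFilter() {
        return row -> true;
    }

    /** Returns the rows accepted by the filter, in their original order. */
    public final List<String> scan() {
        Predicate<String> filter = rowFilter();
        List<String> accepted = new ArrayList<>();
        for (String row : getRows()) {
            if (filter.test(row)) {
                accepted.add(row);
            }
        }
        return accepted;
    }
}

/** Subclass that keeps only rows whose key fully matches a regular expression. */
class RegexFilteredInputFormat extends ExtendableInputFormat {
    private final Pattern pattern;

    public RegexFilteredInputFormat(List<String> rows, String regex) {
        super(rows);
        this.pattern = Pattern.compile(regex);
    }

    @Override
    protected Predicate<String> rowFilter() {
        return row -> pattern.matcher(row).matches();
    }
}
```

With this shape, a filtered variant only overrides the hook and leaves the scanning logic in the base class untouched.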

If you are currently working on this, I could do the hbase end for you.  
Just say.

