hbase-user mailing list archives

From "Ananth T. Sarathy" <ananth.t.sara...@gmail.com>
Subject Re: Waiting forever on scanner iterator
Date Wed, 21 Oct 2009 14:19:25 GMT
Anyone have any further thoughts on this?
Ananth T Sarathy


On Tue, Oct 20, 2009 at 6:37 PM, Ananth T. Sarathy <
ananth.t.sarathy@gmail.com> wrote:

> Well, that's not the case. Every row has that column. In fact, the second
> snippet I sent scans a column with far fewer rows (1k vs. 25k), and it comes
> back pretty quickly.
>
> By forever, I mean I have watched my logs do nothing for half an hour before
> giving up.
>
>
> Ananth T Sarathy
>
>
>
> On Tue, Oct 20, 2009 at 5:03 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>
>> If you are asking for a column that is very sparse and doesn't exist,
>> it will cause HBase to read through the entire region to find 100
>> matching rows. This could take a while. You said 'forever', but could
>> you quantify that?
>>
>> On Tue, Oct 20, 2009 at 1:58 PM, Jean-Daniel Cryans <jdcryans@apache.org>
>> wrote:
>> > Scanner pre-fetching is always faster, so something must be wrong with
>> > your region server. Check the logs, top, etc
>> >
>> > WRT row size, it's pretty much a matter of how many bytes you have
>> > in each column; sum them up (plus some overhead for the keys).
>> >
>> > You want filters; check the filter package in the javadoc.
>> >
>> > J-D
>> >
>> > On Tue, Oct 20, 2009 at 1:52 PM, Ananth T. Sarathy
>> > <ananth.t.sarathy@gmail.com> wrote:
>> >> Ok, but how come
>> >> when I run a similar call (with fewer returned rows, 1000 vs. 25k in the
>> >> previous one) it runs through the iterator very quickly? (See below.)
>> >>
>> >> Also, how do I determine the row size? It's just text data, and really not
>> >> much.
>> >>
>> >> Finally, is there a way to query for rows that do not have a column? (i.e., all
>> >> rows without Files:path1)
>> >>
>> >>        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>> >>                "GS_Applications");
>> >>
>> >>        String[] columns = { "Files:path1" };
>> >>        log.info("Getting all Rows with Files");
>> >>        Scanner s = htdmni.getScannerForAllRows(columns);
>> >>        log.info("Got all Rows with Files");
>> >>
>> >>        Iterator<RowResult> iter = s.iterator();
>> >>        out.write("Application_Full_Name,Version,Application_installer_name,Operating System,"
>> >>                + " Application_platform ,Application_sub_category,md5Hash,Sha1Hash,Sha256Hash,"
>> >>                + "filepath,fileName,modified,size,operation\n");
>> >>        out.write("<BR>");
>> >>        while (iter.hasNext())
>> >>        {
>> >>
>> >> Ananth T Sarathy
>> >>
>> >>
>> >> On Tue, Oct 20, 2009 at 4:44 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> >>
>> >>> If you have a very slow data source (S3), then it fetches 100 rows
>> >>> before coming back to your client with all of them, and that can take a
>> >>> lot of time. Also make sure that 100 of your rows can fit in a region
>> >>> server's memory. How big is each row?
>> >>>
>> >>> J-D
>> >>>
>> >>> On Tue, Oct 20, 2009 at 1:32 PM, Ananth T. Sarathy
>> >>> <ananth.t.sarathy@gmail.com> wrote:
>> >>> > I am running this code where
>> >>> >
>> >>> > getScannerForAllRows(columns) just does return table.getScanner(columns);
>> >>> >
>> >>> > and the table has setScannerCaching(100);
>> >>> >
>> >>> > But it spins forever after getting the iterator. Why would that be? How can
>> >>> > I speed it up?
>> >>> >
>> >>> >        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>> >>> >                "GS_Applications");
>> >>> >
>> >>> >        String[] columns = { "Files:Name" };
>> >>> >        log.info("Getting all Rows with Files");
>> >>> >        Scanner s = htdmni.getScannerForAllRows(columns);
>> >>> >        log.info("Got all Rows with Files");
>> >>> >        log.info("Getting Iterator");
>> >>> >
>> >>> >        Iterator<RowResult> iter = s.iterator();
>> >>> >        log.info("Got Iterator");
>> >>> >
>> >>> >        while (iter.hasNext())
>> >>> >        {
>> >>> >            log.info("Getting next Row");
>> >>> >            RowResult rr = iter.next();
>> >>> >
>> >>> >
>> >>> > Ananth T Sarathy
>> >>> >
>> >>>
>> >>
>> >
>>
>
>
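
As a rough illustration of Ryan's point quoted above (with caching set to 100, the scan keeps reading until it has 100 matching rows or runs out of data), here is a toy model in plain Java. It is not HBase code and does not use the HBase client: the class, the method names, and the column names cf:dense and cf:sparse are all made up for the sketch, and the "table" is just an in-memory list of column sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: every name here is hypothetical and nothing below calls HBase.
class ScannerCachingSketch {

    // Counts how many rows must be read before the first batch of `caching`
    // rows containing `column` is complete (or the data runs out).
    static int rowsReadForFirstBatch(List<Set<String>> rows, String column, int caching) {
        int read = 0;
        int matched = 0;
        for (Set<String> rowColumns : rows) {
            read++;
            if (rowColumns.contains(column)) {
                matched++;
                if (matched == caching) {
                    break; // batch is full, the client would get its first rows now
                }
            }
        }
        return read;
    }

    // Builds a table where every row has cf:dense and every Nth row also has cf:sparse.
    static List<Set<String>> buildTable(int totalRows, int everyNth) {
        List<Set<String>> rows = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            Set<String> columns = new HashSet<>(Arrays.asList("cf:dense"));
            if (i % everyNth == 0) {
                columns.add("cf:sparse");
            }
            rows.add(columns);
        }
        return rows;
    }

    public static void main(String[] args) {
        // 25,000 rows; only 50 of them carry the sparse column.
        List<Set<String>> table = buildTable(25_000, 500);
        System.out.println(rowsReadForFirstBatch(table, "cf:dense", 100));  // prints 100
        System.out.println(rowsReadForFirstBatch(table, "cf:sparse", 100)); // prints 25000
    }
}
```

In this model a dense column fills the first batch after 100 rows, while a column present in fewer than 100 rows forces a read of everything before the first hasNext() could return. Whether that is what is happening on the real table is a separate question, since every row there is reported to have the column.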

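For the row-size question, J-D's rule of thumb quoted above (sum the bytes in each column, plus some overhead for the keys) could be sketched roughly like this. The sketch assumes each cell carries its own copy of the row key and column name, and perCellOverhead is a caller-supplied guess rather than a figure taken from HBase. The class name, method name, and sample values are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Back-of-the-envelope estimate only; not an HBase API.
class RowSizeEstimate {

    // Sums, for every cell, the row key bytes + column name bytes + value bytes
    // + a fixed per-cell overhead chosen by the caller (an assumption).
    static long estimateRowBytes(String rowKey, Map<String, String> columns, int perCellOverhead) {
        long keyBytes = rowKey.getBytes(StandardCharsets.UTF_8).length;
        long total = 0;
        for (Map.Entry<String, String> cell : columns.entrySet()) {
            long nameBytes = cell.getKey().getBytes(StandardCharsets.UTF_8).length;
            long valueBytes = cell.getValue().getBytes(StandardCharsets.UTF_8).length;
            total += keyBytes + nameBytes + valueBytes + perCellOverhead;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("Files:Name", "setup.exe");   // sample values, made up
        columns.put("Files:path1", "/opt/a");

        long perRow = estimateRowBytes("app1", columns, 8);
        System.out.println("approx bytes per row: " + perRow);
        // With caching set to 100, one batch holds roughly 100 rows at once.
        System.out.println("approx bytes per batch of 100: " + (100 * perRow));
    }
}
```

Multiplying the per-row figure by the caching value gives a feel for how much one prefetched batch holds, which is the number J-D asks about when he says 100 rows need to fit in memory.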