hbase-user mailing list archives

From "Ramkrishna.S.Vasudevan" <ramkrishna.vasude...@huawei.com>
Subject RE: Scan addFamily vs FamilyFilter(EQUAL, ...)
Date Thu, 31 May 2012 11:38:57 GMT
Just to add on:
the Javadoc for FamilyFilter clearly says

* If an already known column family is looked for, use {@link
* directly rather than a filter.

So addFamily should be better.
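
To make the difference Anoop describes in the quoted message concrete, here is a toy model in plain Java. It is not HBase code: the class StoreScanSketch, its methods, and the storesOpened counter are all hypothetical names invented for illustration, and the "stores" are just in-memory lists. It only sketches the idea that a family map limits which stores get a scanner, while a filter alone is applied after every store has been read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only, not HBase code: one in-memory "store" per column family. */
class StoreScanSketch {
    /** Stores keyed by family name; the strings stand in for KVs read from HFiles. */
    final Map<String, List<String>> stores = new LinkedHashMap<>();
    /** How many stores had a scanner opened during the last scan. */
    int storesOpened = 0;

    void put(String family, String... cells) {
        stores.put(family, Arrays.asList(cells));
    }

    /** Models Scan#addFamily: scanners are opened only for the requested families. */
    List<String> scanWithFamilies(Set<String> families) {
        storesOpened = 0;
        List<String> result = new ArrayList<>();
        for (String family : families) {
            List<String> cells = stores.get(family);
            if (cells != null) {
                storesOpened++;          // only the requested store is touched
                result.addAll(cells);
            }
        }
        return result;
    }

    /** Models a FamilyFilter with no family map: every store is read, then filtered. */
    List<String> scanWithFamilyFilter(String wanted) {
        storesOpened = 0;
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> store : stores.entrySet()) {
            storesOpened++;                       // scanner opened regardless of family
            if (store.getKey().equals(wanted)) {  // filter applied after the read
                result.addAll(store.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        StoreScanSketch region = new StoreScanSketch();
        region.put("cf1", "row1/cf1:a", "row2/cf1:a");
        region.put("cf2", "row1/cf2:b");
        region.put("cf3", "row1/cf3:c");

        List<String> viaFamily = region.scanWithFamilies(new TreeSet<>(Arrays.asList("cf1")));
        System.out.println(viaFamily + " stores opened: " + region.storesOpened);

        List<String> viaFilter = region.scanWithFamilyFilter("cf1");
        System.out.println(viaFilter + " stores opened: " + region.storesOpened);
    }
}
```

Both calls return the same cells; the only thing that differs in this model is how many stores were opened to get them, which is the cost the thread is discussing.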


> -----Original Message-----
> From: Anoop Sam John [mailto:anoopsj@huawei.com]
> Sent: Thursday, May 31, 2012 11:49 AM
> To: user@hbase.apache.org
> Subject: RE: Scan addFamily vs FamilyFilter(EQUAL, ...)
> Hi,
>      As per my understanding of the Scan code, in your scenario, where
> you want to scan only some CFs (not all), you should go with
> Scan#addFamily.
> The FamilyFilter does the same thing, but there is a difference
> in performance.
> When one specifies the CFs in the scan, scanners are created for
> only those Stores. For the other CFs there won't be any scanners,
> so those stores are not scanned (the HFile data is not fetched).
> When one instead uses the FamilyFilter and does not specify any
> column families (using Scan#addFamily), all the stores get scanned and
> data gets fetched from the HFiles. Only the KVs you needed (as per
> your FamilyFilter) then get included in the Result; the others are
> just skipped. So I feel there will be a performance difference.
> Correct me if I am wrong, please.
> @Stack
> >One thing I ran into when using the Scan.addFamily / Scan.addColumn is
> >that those two methods overwrite each other.
> The Scan#addColumn javadoc clearly describes this overriding, so it
> seems to be intentional, correct?
> -Anoop-
> ________________________________________
> From: saint.ack@gmail.com [saint.ack@gmail.com] on behalf of Stack
> [stack@duboce.net]
> Sent: Wednesday, May 30, 2012 11:13 PM
> To: user@hbase.apache.org
> Subject: Re: Scan addFamily vs FamilyFilter(EQUAL, ...)
> On Wed, May 30, 2012 at 9:59 AM, Kevin <kevin.macksamie@gmail.com>
> wrote:
> > I am curious and trying to learn which method is best when wanting
> > to limit a scan to a particular column or column family. The Scan
> > class carries a Filter instance and a TreeMap of the family map and
> > I am unsure how they get carried through to the server-side
> > functionality. In terms of performance is there any difference
> > between doing Scan.addFamily(x) and
> > Scan.setFilter(new FamilyFilter(CompareFilter.CompareOp.EQUAL, x))?
> >
> There is probably no noticeable difference in performance, but
> Scan#addFamily is the more natural way of expressing column family
> scoping.
> St.Ack
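
On the addFamily / addColumn overwrite point raised in the thread, the following is a small sketch of how a family map of the kind Kevin mentions (a TreeMap from family to a set of qualifiers) could produce that behaviour. It is a toy model using Strings, not the HBase Scan class: FamilyMapSketch and its fields are hypothetical names, and treating a null value as "the whole family" is an assumption made for this illustration.

```java
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only, not HBase code: a family map keyed by family name. */
class FamilyMapSketch {
    /** family -> qualifiers; a null value is taken to mean "the whole family". */
    final TreeMap<String, NavigableSet<String>> familyMap = new TreeMap<>();

    /** Selects the whole family, discarding any qualifiers chosen earlier. */
    FamilyMapSketch addFamily(String family) {
        familyMap.put(family, null);
        return this;
    }

    /** Narrows the family to specific qualifiers, replacing a whole-family entry. */
    FamilyMapSketch addColumn(String family, String qualifier) {
        NavigableSet<String> qualifiers = familyMap.get(family);
        if (qualifiers == null) {
            // Either the family is new or it was previously selected as a whole.
            qualifiers = new TreeSet<>();
            familyMap.put(family, qualifiers);
        }
        qualifiers.add(qualifier);
        return this;
    }

    public static void main(String[] args) {
        // addFamily after addColumn: the qualifier restriction is dropped.
        System.out.println(new FamilyMapSketch().addColumn("cf", "q1").addFamily("cf").familyMap);
        // addColumn after addFamily: the scan is narrowed to that one qualifier.
        System.out.println(new FamilyMapSketch().addFamily("cf").addColumn("cf", "q1").familyMap);
    }
}
```

In this model the last call for a given family wins, so the order of addFamily and addColumn calls matters when both are used on the same family.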
