hbase-user mailing list archives

From Christophe Taton <ta...@wibidata.com>
Subject Re: Behavior of Filter.transform() in FilterList?
Date Tue, 02 Jul 2013 06:58:10 GMT
Hi,

I created https://issues.apache.org/jira/browse/HBASE-8847, with a small
patch (
https://github.com/kryzthov/hbase/commit/bd9a3b325d5d335fba04b5f7ce5f588e673cac91)
based on 0.94.8.
That seems to fix the problem on my side, but I would need to do some more
testing to ensure it doesn't introduce other unwanted side-effects.

C.


On Mon, Jul 1, 2013 at 7:54 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Christophe:
> Looks like you have a clear idea of what to do.
>
> If you can show us in the form of a patch, that would be nice.
>
> Cheers
>
> On Mon, Jul 1, 2013 at 7:17 PM, Christophe Taton <taton@wibidata.com> wrote:
>
> > On Mon, Jul 1, 2013 at 12:01 PM, lars hofhansl <larsh@apache.org> wrote:
> >
> > > It would make sense, but it is not immediately clear how to do so cleanly.
> > > We would no longer be able to call transform at the StoreScanner level (or
> > > evaluate the filter multiple times, or require the filters to maintain
> > > their - last - state and only apply transform selectively).
> > >
> >
> > I believe this change can be implemented directly in FilterList, without
> > requiring other changes.
> > A FilterList could compute its transformed KeyValue while applying
> > filterKeyValue() on the filters it contains, and return the pre-computed
> > transformed KeyValue in FilterList.transform() if it makes sense to do so.
> >
> > This means Filter.transform() is always applied immediately after a
> > filterKeyValue() with a return code that includes the KeyValue, and this
> > would be true for all filters in the hierarchy.
> >
> > C.
> >
> > > I added transform() a while ago in order to allow a Filter *not* to
> > > transform. Before that, we defensively made a copy of the key each time,
> > > just in case a Filter (such as KeyOnlyFilter) would modify it; now this is
> > > formalized, and the filter is responsible for making a copy only when needed.
> > >
> > >
> > > -- Lars
> > >
> > >
> > >
> > > ________________________________
> > >  From: Christophe Taton <taton@wibidata.com>
> > > To: user@hbase.apache.org; lars hofhansl <larsh@apache.org>
> > > Sent: Monday, July 1, 2013 10:27 AM
> > > Subject: Re: Behavior of Filter.transform() in FilterList?
> > >
> > >
> > >
> > > On Mon, Jul 1, 2013 at 4:14 AM, lars hofhansl <larsh@apache.org> wrote:
> > >
> > > You want transform to only be called on filters that are "reached"?
> > > > >I.e. FilterA and FilterB, FilterB.transform should not be called if a KV
> > > > >is already filtered by FilterA?
> > > >
> > >
> > > Yes, that's what I naively expected, at first.
> > >
> > > That's not how it works right now, transform is called in a completely
> > > different code path from the actual filtering logic.
> > > >
> > >
> > > Indeed, I just learned that.
> > > I found no documentation of this behavior, did I miss it?
> > > In particular, the javadoc of the workflow of Filter doesn't mention
> > > transform() at all.
> > > Would it make sense to apply transform() only if the return code for
> > > filterKeyValue() includes the KeyValue?
> > >
> > > C.
> > >
> > > -- Lars
> > > >
> > > >
> > > >----- Original Message -----
> > > >From: Christophe Taton <taton@wibidata.com>
> > > >To: user@hbase.apache.org
> > > >Cc:
> > > >Sent: Sunday, June 30, 2013 10:26 PM
> > > >Subject: Re: Behavior of Filter.transform() in FilterList?
> > > >
> > > >On Sun, Jun 30, 2013 at 10:15 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > >> The clause 'family=X and column=Y and KeyOnlyFilter' would be represented
> > > > >> by a FilterList, right?
> > > > >> (family=A and column=B) would be represented by another FilterList.
> > > >>
> > > >
> > > > >Yes, that would be FilterList(OR, [FilterList(AND, [family=X, column=Y,
> > > > >KeyOnlyFilter]), FilterList(AND, [family=A, column=B])]).
> > > >
> > > >So the behavior is expected.
> > > >>
> > > >
> > > > >Could you explain? I'm not sure how you reached this conclusion.
> > > > >Are you saying it is expected, given the actual implementation of
> > > > >FilterList.transform()?
> > > > >Or are there some other details I missed?
> > > >
> > > >Thanks!
> > > >C.
> > > >
> > > > >On Mon, Jul 1, 2013 at 1:10 PM, Christophe Taton <taton@wibidata.com> wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > > >> > From
> > > > >> > https://github.com/apache/hbase/blob/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java#L183,
> > > > >> > it appears that Filter.transform() is invoked unconditionally on all
> > > > >> > filters in a FilterList hierarchy.
> > > >> >
> > > > >> > This is quite confusing, especially since I may construct a filter like:
> > > > >> >     (family=X and column=Y and KeyOnlyFilter) or (family=A and column=B)
> > > > >> > The KeyOnlyFilter will remove all values from the KeyValues in A:B as well.
> > > >> >
> > > > >> > Is my understanding correct? Is this an expected/intended behavior?
> > > >> >
> > > >> > Thanks,
> > > >> > C.
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>

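For readers following the thread, the difference between the two behaviors can be sketched with a small stand-alone model. This is not the HBASE-8847 patch and it does not use the HBase classes: Cell, Filter, FilterList, column(), keyOnly() and transformUnconditionally() below are hypothetical stand-ins, include() models a filterKeyValue() call whose return code includes the KeyValue, and the family and column filters from the example are collapsed into a single column match for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model only: these types are hypothetical stand-ins,
// not the org.apache.hadoop.hbase classes.
class FilterListSketch {

    /** Stand-in for KeyValue: a "family:qualifier" column plus a value. */
    static final class Cell {
        final String column;
        final String value;
        Cell(String column, String value) { this.column = column; this.value = value; }
    }

    /** Stand-in for Filter, reduced to the two methods discussed in the thread. */
    interface Filter {
        /** Models filterKeyValue() returning a code that includes the cell. */
        boolean include(Cell c);
        /** Models transform(); the default leaves the cell untouched. */
        default Cell transform(Cell c) { return c; }
    }

    /** Includes only cells of the given column. */
    static Filter column(String name) {
        return c -> c.column.equals(name);
    }

    /** Models KeyOnlyFilter: includes everything and strips the value. */
    static Filter keyOnly() {
        return new Filter() {
            @Override public boolean include(Cell c) { return true; }
            @Override public Cell transform(Cell c) { return new Cell(c.column, ""); }
        };
    }

    /** Filter list that pre-computes its transformed cell while filtering. */
    static final class FilterList implements Filter {
        final boolean mustPassAll;   // true = AND, false = OR
        final List<Filter> filters;
        private Cell transformed;    // remembered from the last include() call

        FilterList(boolean mustPassAll, Filter... filters) {
            this.mustPassAll = mustPassAll;
            this.filters = Arrays.asList(filters);
        }

        @Override public boolean include(Cell c) {
            transformed = c;
            Cell current = c;
            boolean included = mustPassAll;
            for (Filter f : filters) {
                if (f.include(c)) {
                    // Only a filter that included the cell gets to transform it.
                    current = f.transform(current);
                    included = true;
                } else if (mustPassAll) {
                    return false;    // transformed stays the untouched cell
                }
            }
            if (included) {
                transformed = current;
            }
            return included;
        }

        @Override public Cell transform(Cell c) {
            return transformed != null ? transformed : c;
        }
    }

    /** Models the behavior reported in the thread: every transform() runs. */
    static Cell transformUnconditionally(Filter f, Cell c) {
        if (f instanceof FilterList) {
            Cell current = c;
            for (Filter child : ((FilterList) f).filters) {
                current = transformUnconditionally(child, current);
            }
            return current;
        }
        return f.transform(c);
    }

    /** (X:Y and KeyOnlyFilter) or (A:B), as in the original question. */
    static Filter example() {
        return new FilterList(false,
            new FilterList(true, column("X:Y"), keyOnly()),
            new FilterList(true, column("A:B")));
    }

    public static void main(String[] args) {
        Filter f = example();
        Cell ab = new Cell("A:B", "v1");
        System.out.println(f.include(ab));                                    // true
        System.out.println("[" + f.transform(ab).value + "]");                // [v1]
        System.out.println("[" + transformUnconditionally(f, ab).value + "]"); // []
    }
}
```

In this model the A:B cell keeps its value when transform() is tied to the filters that actually included it, and loses it when every transform() in the hierarchy runs regardless of the filtering result.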