hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Behavior of Filter.transform() in FilterList?
Date Mon, 01 Jul 2013 19:01:04 GMT
It would make sense, but it is not immediately clear how to do so cleanly. We would no longer
be able to call transform at the StoreScanner level (or we would have to evaluate the filter
multiple times, or require the filters to maintain their last state and only apply transform
selectively).

I added transform() a while ago in order to allow a Filter *not* to transform. Before that,
we defensively made a copy of the key each time, just in case a Filter (such as KeyOnlyFilter)
would modify it. Now this is formalized, and the filter is responsible for making a copy only
when needed.

-- Lars
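
The copy-only-when-needed contract described above can be pictured with a small sketch. It assumes simplified stand-in types: Cell, CellFilter, PassThrough and KeyOnlySketch are hypothetical names made up for illustration and are not the HBase classes or their signatures.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified stand-ins for illustration only; these are NOT the HBase classes.
final class TransformContractSketch {
    static final class Cell {
        final byte[] key;
        final byte[] value;
        Cell(byte[] key, byte[] value) { this.key = key; this.value = value; }
    }

    interface CellFilter {
        // Returns the cell to emit. The filter decides whether a copy is needed.
        Cell transform(Cell in);
    }

    // Leaves cells alone, so it hands back the same instance without copying.
    static final class PassThrough implements CellFilter {
        public Cell transform(Cell in) { return in; }
    }

    // Loosely modeled on the idea of a key-only filter: it changes the cell,
    // so it builds a new one instead of mutating the caller's instance.
    static final class KeyOnlySketch implements CellFilter {
        public Cell transform(Cell in) {
            return new Cell(Arrays.copyOf(in.key, in.key.length), new byte[0]);
        }
    }

    public static void main(String[] args) {
        Cell original = new Cell("row/X:Y".getBytes(StandardCharsets.UTF_8),
                                 "v1".getBytes(StandardCharsets.UTF_8));
        Cell same = new PassThrough().transform(original);
        Cell stripped = new KeyOnlySketch().transform(original);
        System.out.println("pass-through returned same instance: " + (same == original));
        System.out.println("stripped value length: " + stripped.value.length);
        System.out.println("original value length: " + original.value.length);
    }
}
```

The point of the sketch is only that the caller no longer copies defensively: a filter that does not modify anything costs no allocation, and a filter that does modify the cell pays for its own copy.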

 From: Christophe Taton <taton@wibidata.com>
To: user@hbase.apache.org; lars hofhansl <larsh@apache.org> 
Sent: Monday, July 1, 2013 10:27 AM
Subject: Re: Behavior of Filter.transform() in FilterList?

On Mon, Jul 1, 2013 at 4:14 AM, lars hofhansl <larsh@apache.org> wrote:

> You want transform to only be called on filters that are "reached"?
> I.e. FilterA and FilterB, FilterB.transform should not be called if a KV is already
> filtered by FilterA?

Yes, that's what I naively expected, at first.

> That's not how it works right now, transform is called in a completely different code path
> from the actual filtering logic.

Indeed, I just learned that.
I found no documentation of this behavior, did I miss it?
In particular, the javadoc of the workflow of Filter doesn't mention transform() at all.
Would it make sense to apply transform() only if the return code for filterKeyValue() includes
the KeyValue?
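
One way to picture this suggestion is a filter list that remembers which children accepted the last cell and only runs their transforms. The sketch below is a toy model under stated assumptions: Cell, Filter, FilterList, KeyOnly, include() and the helper methods are hypothetical stand-ins, not the HBase API, and this is not how HBase behaves today.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model for illustration only; none of these types are the real HBase classes.
final class SelectiveTransformSketch {
    static final class Cell {
        final String family, column, value;
        Cell(String family, String column, String value) {
            this.family = family; this.column = column; this.value = value;
        }
    }

    interface Filter {
        boolean include(Cell c);                      // stand-in for the filtering decision
        default Cell transform(Cell c) { return c; }  // stand-in for Filter.transform()
    }

    enum Op { AND, OR }

    static Filter family(String f) { return c -> c.family.equals(f); }
    static Filter column(String q) { return c -> c.column.equals(q); }

    // Accepts everything, but strips the value when transforming.
    static final class KeyOnly implements Filter {
        public boolean include(Cell c) { return true; }
        public Cell transform(Cell c) { return new Cell(c.family, c.column, ""); }
    }

    static final class FilterList implements Filter {
        final Op op;
        final List<Filter> filters;
        // Children that accepted the most recent cell passed to include().
        private final List<Filter> accepted = new ArrayList<>();

        FilterList(Op op, Filter... filters) {
            this.op = op;
            this.filters = Arrays.asList(filters);
        }

        public boolean include(Cell c) {
            accepted.clear();
            for (Filter f : filters) {
                if (f.include(c)) accepted.add(f);
            }
            boolean result = (op == Op.AND) ? accepted.size() == filters.size()
                                            : !accepted.isEmpty();
            if (!result) accepted.clear();  // a rejected list transforms nothing
            return result;
        }

        // Only children that accepted the last cell get to transform it.
        public Cell transform(Cell c) {
            Cell out = c;
            for (Filter f : accepted) out = f.transform(out);
            return out;
        }
    }

    static Filter example() {
        return new FilterList(Op.OR,
            new FilterList(Op.AND, family("X"), column("Y"), new KeyOnly()),
            new FilterList(Op.AND, family("A"), column("B")));
    }

    public static void main(String[] args) {
        Filter f = example();
        for (Cell c : List.of(new Cell("A", "B", "payload"), new Cell("X", "Y", "payload"))) {
            if (f.include(c)) {
                System.out.println(c.family + ":" + c.column + " -> [" + f.transform(c).value + "]");
            }
        }
    }
}
```

The cost is visible in the sketch: the list becomes stateful, and transform() is only meaningful right after include() has been called for the same cell.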


-- Lars
>----- Original Message -----
>From: Christophe Taton <taton@wibidata.com>
>To: user@hbase.apache.org
>Sent: Sunday, June 30, 2013 10:26 PM
>Subject: Re: Behavior of Filter.transform() in FilterList?
>On Sun, Jun 30, 2013 at 10:15 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> The clause 'family=X and column=Y and KeyOnlyFilter' would be represented
>> by a FilterList, right?
>> (family=A and column=B) would be represented by another FilterList.
>Yes, that would be FilterList(OR, [FilterList(AND, [family=X, column=Y,
>KeyOnlyFilter]), FilterList(AND, [family=A, column=B])]).
>> So the behavior is expected.
>Could you explain? I'm not sure how you reached this conclusion.
>Are you saying it is expected, given the actual implementation?
>Or are there some other details I missed?
>On Mon, Jul 1, 2013 at 1:10 PM, Christophe Taton <taton@wibidata.com> wrote:
>> > Hi,
>> >
>> > From
>> > https://github.com/apache/hbase/blob/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java#L183,
>> > it appears that Filter.transform() is invoked unconditionally on all
>> > filters in a FilterList hierarchy.
>> >
>> > This is quite confusing, especially since I may construct a filter like:
>> >     (family=X and column=Y and KeyOnlyFilter) or (family=A and column=B)
>> > The KeyOnlyFilter will remove all values from the KeyValues in A:B as well.
>> >
>> > Is my understanding correct? Is this an expected/intended behavior?
>> >
>> > Thanks,
>> > C.
>> >
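
The behavior reported in this first message can be reproduced with a toy model. The sketch below assumes simplified stand-in types: Cell, Filter, FilterList, KeyOnly, include() and the helper methods are hypothetical names for illustration, not the HBase classes, and the only property borrowed from the thread is that transform is applied to every filter in the hierarchy regardless of the filtering outcome.

```java
import java.util.Arrays;
import java.util.List;

// Toy model for illustration only; none of these types are the real HBase classes.
final class UnconditionalTransformSketch {
    static final class Cell {
        final String family, column, value;
        Cell(String family, String column, String value) {
            this.family = family; this.column = column; this.value = value;
        }
    }

    interface Filter {
        boolean include(Cell c);                      // stand-in for the filtering decision
        default Cell transform(Cell c) { return c; }  // stand-in for Filter.transform()
    }

    enum Op { AND, OR }

    static Filter family(String f) { return c -> c.family.equals(f); }
    static Filter column(String q) { return c -> c.column.equals(q); }

    // Accepts everything, but strips the value when transforming.
    static final class KeyOnly implements Filter {
        public boolean include(Cell c) { return true; }
        public Cell transform(Cell c) { return new Cell(c.family, c.column, ""); }
    }

    static final class FilterList implements Filter {
        final Op op;
        final List<Filter> filters;

        FilterList(Op op, Filter... filters) {
            this.op = op;
            this.filters = Arrays.asList(filters);
        }

        public boolean include(Cell c) {
            return op == Op.AND ? filters.stream().allMatch(f -> f.include(c))
                                : filters.stream().anyMatch(f -> f.include(c));
        }

        // Every child's transform runs, whether or not that child accepted the cell.
        public Cell transform(Cell c) {
            Cell out = c;
            for (Filter f : filters) out = f.transform(out);
            return out;
        }
    }

    // (family=X and column=Y and KeyOnly) or (family=A and column=B)
    static Filter example() {
        return new FilterList(Op.OR,
            new FilterList(Op.AND, family("X"), column("Y"), new KeyOnly()),
            new FilterList(Op.AND, family("A"), column("B")));
    }

    public static void main(String[] args) {
        Filter f = example();
        Cell ab = new Cell("A", "B", "payload");
        System.out.println("A:B included: " + f.include(ab));
        System.out.println("A:B value after transform: [" + f.transform(ab).value + "]");
    }
}
```

In this model the A:B cell is accepted by the second branch only, yet it still passes through the KeyOnly transform of the first branch and loses its value, which is the surprise described above.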