hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: help with filters
Date Wed, 27 Jan 2010 01:01:14 GMT
On Tue, Jan 26, 2010 at 4:51 PM, Chris Bates
<christopher.andrew.bates@gmail.com> wrote:
>

MUST_PASS_ALL "works" in the sense that there's a unit test that asserts so.
I'm not sure what it is about your data profile that is messing with
this functionality.  It's something involved, and my guess is that the
only way to figure it out is to set up some kind of harness and step
through it in the debugger.  Any chance of your having a go at that, Chris?

Thanks,
St.Ack


> Second, I'm still not able to get the AND operation working.
>
> To illustrate:
>
> hbase(main):010:0> scan 'testTable', {COLUMNS=>["user:theme", "user:REMOTE_ADDR"]}
> ROW                          COLUMN+CELL
>  row1                        column=user:REMOTE_ADDR, timestamp=1264464021672, value=172.16.1.3
>  row1                        column=user:theme, timestamp=1264464041857, value=Frost
>  row2                        column=user:theme, timestamp=1264464058064, value=Sunshine
>  row3                        column=user:REMOTE_ADDR, timestamp=1264464083332, value=172.16.0.06
>
> With MUST_PASS_ALL enabled...
>
> If I comment out the REMOTE_ADDR filter, I get:
> IP: null Theme: Frost
> IP: null Theme: Sunshine
>
> If I comment out the theme filter, I get the reverse.
> IP: 172.16.1.3 Theme: null
> IP: 172.16.0.06 Theme: null
>
> If I leave both in, I get __nothing__, when I want:
> IP: 172.16.1.3 Theme: Frost
>
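The three outcomes above are what one would expect if each filter is evaluated against one cell at a time rather than against the whole row. The following is a toy model in plain Python, not the HBase API: `column_is`, `must_pass_all`, and `scan` are hypothetical helpers, and since the pastebin code is not part of this thread, the assumption that the filters select cells by column is only a guess. Under that assumption no single cell can be both user:theme and user:REMOTE_ADDR, so an AND of the two rejects every cell.

```python
# Toy model of per-cell filter evaluation. NOT the HBase API.
# Assumption: each filter accepts or rejects one cell at a time.
from typing import Callable, Dict, List, Tuple

Cell = Tuple[str, str, str]  # (row key, "family:qualifier", value)
CellFilter = Callable[[Cell], bool]

# The contents of 'testTable' as shown in the scan output above.
TEST_TABLE: List[Cell] = [
    ("row1", "user:REMOTE_ADDR", "172.16.1.3"),
    ("row1", "user:theme", "Frost"),
    ("row2", "user:theme", "Sunshine"),
    ("row3", "user:REMOTE_ADDR", "172.16.0.06"),
]


def column_is(column: str) -> CellFilter:
    """Hypothetical filter: accept a cell only if it belongs to `column`."""
    return lambda cell: cell[1] == column


def must_pass_all(filters: List[CellFilter]) -> CellFilter:
    """Hypothetical AND: a cell passes only if every filter accepts it."""
    return lambda cell: all(f(cell) for f in filters)


def scan(cells: List[Cell], cell_filter: CellFilter) -> Dict[str, Dict[str, str]]:
    """Return the surviving cells grouped by row key."""
    rows: Dict[str, Dict[str, str]] = {}
    for row, column, value in cells:
        if cell_filter((row, column, value)):
            rows.setdefault(row, {})[column] = value
    return rows


theme_only = scan(TEST_TABLE, must_pass_all([column_is("user:theme")]))
addr_only = scan(TEST_TABLE, must_pass_all([column_is("user:REMOTE_ADDR")]))
both = scan(
    TEST_TABLE,
    must_pass_all([column_is("user:theme"), column_is("user:REMOTE_ADDR")]),
)

print(theme_only)  # only the theme cells of row1 and row2 survive
print(addr_only)   # only the REMOTE_ADDR cells of row1 and row3 survive
print(both)        # {} because no single cell has both qualifiers
```

If the real filters behave differently from this model, stepping through them in a debugger against the same four cells should show where the two diverge.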
> I thought this might be due to HBase not being able to do an AND operation
> on Qualifiers of the same column, so I created another testTable2 with two
> different columns:
>
> hbase(main):024:0> scan 'testTable2'
> ROW                          COLUMN+CELL
>  row1                        column=addr:REMOTE_ADDR, timestamp=1264552425218, value=172.16.1.3
>  row1                        column=user:theme, timestamp=1264552375737, value=Frost
>  row2                        column=user:theme, timestamp=1264552505491, value=Sunshine
>  row3                        column=addr:REMOTE_ADDR, timestamp=1264552538651, value=172.16.0.36
>
> But nothing changed.
>
>
> Any other thoughts?  The only solution I can see to get this done is to
> implement a row counter for each column+qualifier and then store the results
> that meet criteria that I expect, but I was hoping a native filter would do
> the job.
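One reading of the client-side workaround described above is to group the scanned cells by row and keep only the rows that contain every required column. The sketch below does this in plain Python over the testTable2 data; `rows_with_all` and its `limit` parameter are hypothetical names, and the code models the idea only, without touching the HBase client.

```python
# Client-side sketch of "keep only rows that have every required column".
# Plain Python over in-memory tuples; it does not use the HBase client.
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Cell = Tuple[str, str, str]  # (row key, "family:qualifier", value)

# The contents of 'testTable2' as shown in the scan output above.
TEST_TABLE_2: List[Cell] = [
    ("row1", "addr:REMOTE_ADDR", "172.16.1.3"),
    ("row1", "user:theme", "Frost"),
    ("row2", "user:theme", "Sunshine"),
    ("row3", "addr:REMOTE_ADDR", "172.16.0.36"),
]


def rows_with_all(
    cells: Iterable[Cell],
    required: List[str],
    limit: Optional[int] = None,
) -> List[Tuple[str, Dict[str, str]]]:
    """Hypothetical helper: return rows holding all `required` columns."""
    rows: Dict[str, Dict[str, str]] = defaultdict(dict)
    for row, column, value in cells:
        rows[row][column] = value

    matches: List[Tuple[str, Dict[str, str]]] = []
    for row in sorted(rows):
        columns = rows[row]
        if all(name in columns for name in required):
            matches.append((row, columns))
            if limit is not None and len(matches) >= limit:
                break  # stop early, e.g. after the first 10 matches
    return matches


matches = rows_with_all(TEST_TABLE_2, ["addr:REMOTE_ADDR", "user:theme"], limit=10)
for row, columns in matches:
    print(f"IP: {columns['addr:REMOTE_ADDR']} Theme: {columns['user:theme']}")
# prints: IP: 172.16.1.3 Theme: Frost
```

The cost of this approach is that every cell is shipped to the client before the row-level check runs, which is why a server-side filter would be preferable for a very sparse table.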
>
>
> On Mon, Jan 25, 2010 at 8:43 PM, Stack <stack@duboce.net> wrote:
>
>> See the TestFilterList under unit tests, src/test.  Can you mess
>> around with it using your data and see if it tells you anything?
>> There's a testMPALL in there.   Might give you a clue (Your code looks
>> fine)
>>
>> St.Ack
>>
>> On Mon, Jan 25, 2010 at 4:25 PM, Chris Bates
>> <christopher.andrew.bates@gmail.com> wrote:
>> > Thanks, Stack. I upgraded to 0.20.3 RC3.
>> >
>> > I was still getting the hanging, so I decided to create a really simple
>> > table to try to see if I can get the logic working:
>> >
>> > hbase(main):031:0> scan 'testTable'
>> > ROW                          COLUMN+CELL
>> >  row1                        column=user:REMOTE_ADDR, timestamp=1264464021672, value=172.16.1.3
>> >  row1                        column=user:theme, timestamp=1264464041857, value=Frost
>> >  row2                        column=user:theme, timestamp=1264464058064, value=Sunshine
>> >  row3                        column=user:REMOTE_ADDR, timestamp=1264464083332, value=172.16.0.06
>> >
>> > Without the filter (http://pastebin.com/m20ba0d2d) this is my output
>> > client-side:
>> > IP: 172.16.1.3
>> > Theme: Frost
>> > IP: null
>> > Theme: Sunshine
>> > IP: 172.16.0.06
>> > Theme: null
>> >
>> > If I uncomment the setFilter, I get nothing.  I'm expecting to get the
>> > first two lines (row1).  Thus I don't believe my filters are set up
>> > correctly, but I'm unsure where the error would be.
>> >
>> > Does anyone have any thoughts or examples?
>> >
>> > Thanks!
>> >
>> >
>> > On Mon, Jan 25, 2010 at 1:45 PM, Stack <stack@duboce.net> wrote:
>> >
>> >> Check out the CHANGES in 0.20.2 and even in 0.20.3RC3:
>> >>
>> >> http://svn.apache.org/viewvc/hadoop/hbase/branches/0.20/CHANGES.txt?view=log
>> >>
>> >> I believe your issue is fixed there.
>> >> St.Ack
>> >>
>> >> On Mon, Jan 25, 2010 at 10:36 AM, Chris Bates
>> >> <christopher.andrew.bates@gmail.com> wrote:
>> >> > 0.20.1
>> >> >
>> >> > On Mon, Jan 25, 2010 at 1:31 PM, Stack <stack@duboce.net> wrote:
>> >> >
>> >> >> What version of HBase?
>> >> >> St.Ack
>> >> >>
>> >> >> On Sat, Jan 23, 2010 at 7:49 PM, Chris Bates
>> >> >> <christopher.andrew.bates@gmail.com> wrote:
>> >> >> > Hi all,
>> >> >> >
>> >> >> > I'm trying to do an AND operation and I'm not sure if I did the
>> >> >> > filtering correctly because HBase is hanging on me.
>> >> >> >
>> >> >> > What I want is this:
>> >> >> >
>> >> >> > I have two qualifiers, theme and IP, to my column user.  I'd like to
>> >> >> > print out all matches (or maybe just 10) where the row has both of
>> >> >> > them in it.  My impression is that this is what HBase would excel at,
>> >> >> > because the dataset is VERY sparse, meaning that out of 1000-10,000
>> >> >> > rows, maybe just 1 or 2 will have BOTH an IP and a theme in it.  Most
>> >> >> > of the time it's just one or the other.
>> >> >> >
>> >> >> > So this is my code to make that query, but as I said, it's hanging.
>> >> >> > http://pastebin.com/m7fcef49
>> >> >> >
>> >> >> > If I comment out the filters, the query runs just fine and will print
>> >> >> > null wherever the value is not present.
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>
