hbase-user mailing list archives

From "Clint Morgan" <clint....@gmail.com>
Subject Re: Filter omitting columns
Date Tue, 25 Mar 2008 23:27:32 GMT
That's odd. When I use the RegExpRowFilter and it filters based on a
column's value, the whole row is dropped from the results (as you
expected).

So to answer your question: you should not have to handle this
manually. Let the filter do it.
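
A minimal sketch of the row-level behaviour described above, written in
plain Java rather than against the HBase API. The class name, method names,
row keys, and sample URLs are all made up for illustration; only the
'url:' and 'status:' column names come from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of row-level filtering. This is NOT the HBase API;
// every name here is hypothetical.
final class RowFilterSketch {

    /**
     * Keeps only the rows whose filterColumn value equals expected.
     * A row that lacks the column, or holds a different value, is dropped
     * as a whole, so none of its other columns reach the caller.
     */
    static Map<String, Map<String, String>> filterRows(
            Map<String, Map<String, String>> table,
            String filterColumn,
            String expected) {
        Map<String, Map<String, String>> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            if (expected.equals(row.getValue().get(filterColumn))) {
                kept.put(row.getKey(), row.getValue());
            }
        }
        return kept;
    }

    /** Two sample rows: row key -> (column name -> value). */
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("row1", Map.of("url:", "http://a.example", "status:", "1"));
        table.put("row2", Map.of("url:", "http://b.example", "status:", "0"));
        return table;
    }

    public static void main(String[] args) {
        // Only row1 survives, and it keeps its 'url:' column as well.
        filterRows(sampleTable(), "status:", "1")
                .forEach((key, columns) -> System.out.println(key + " -> " + columns));
    }
}
```

In this model the predicate decides the fate of the whole row, which is the
SQL-like "WHERE" behaviour asked about, not a per-column trim.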

One thing in your examples that strikes me as odd is identifying
columns solely by their family name. As Jim pointed out, this
works in scanner construction, but I'm not sure it works
everywhere else (e.g., inside filters). I would try fully
qualified column names (family:name).
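
To make the concern concrete, here is a small plain-Java sketch. It assumes
a filter that looks columns up by exact name, which is only a guess at why a
bare family name might not match inside a filter; it is not a description of
HBase internals. The qualifier "code" and all class and method names are
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrates family-only versus fully qualified column names.
// NOT the HBase API; every name here is hypothetical.
final class ColumnNameSketch {

    /** True when the name has a non-empty qualifier after "family:". */
    static boolean isFullyQualified(String column) {
        int sep = column.indexOf(':');
        return sep > 0 && sep < column.length() - 1;
    }

    /** Exact-name lookup, as an assumed filter implementation might do. */
    static boolean filterMatchesColumn(Map<String, String> columnFilter,
                                       String storedColumn) {
        return columnFilter.containsKey(storedColumn);
    }

    public static void main(String[] args) {
        Map<String, String> columnFilter = new HashMap<>();
        columnFilter.put("status:", "1"); // family-only key

        // A cell stored under a qualified name is missed by exact lookup.
        System.out.println(isFullyQualified("status:"));
        System.out.println(isFullyQualified("status:code"));
        System.out.println(filterMatchesColumn(columnFilter, "status:code"));
    }
}
```

Under that assumption, a scanner could expand 'status:' as a wild card while
the filter sees no column of that exact name.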


On Fri, Mar 21, 2008 at 6:02 AM, Goel, Ankur <Ankur.Goel@corp.aol.com> wrote:
> Clint,
>       Thanks! The patch works and I am able to get the remaining columns
>  successfully.
>  One question though, it seems like the behaviour is different from what
>  I expected.
>  I wanted something like
>
>  Select url:, status: from mytable where status=0;
>
>  It is correctly able to filter out the status values <> 0 but the 'url:'
>  column values are untouched !
>
>  Coming from an RDBMS background, I was hoping that the application of
>  column filter would yield similar result! (fetching only those 'url:'
>  column values for which 'status:' matched)
>
>  In a column-oriented database like HBase, should such a scenario be
>  handled manually?
>  E.g. by ignoring 'url:' values if the 'status:' value got filtered?
>
>  Is there a tutorial available that shows 2-3 tables in RDBMS style and
>  then transforms them into column oriented design ?
>
>  Thanks
>  -Ankur
>
>  -----Original Message-----
>  From: Clint Morgan [mailto:clint.a.m@gmail.com]
>  Sent: Thursday, March 20, 2008 10:11 PM
>  To: hbase-user@hadoop.apache.org
>  Subject: Re: Filter omitting columns
>
>  That's great, but it won't do what he wants, as he wants only rows where
>  the value for the status: column equals getBytes(1). Hence the filter.
>
>  On Thu, Mar 20, 2008 at 9:30 AM, Jim Kellerman <jim@powerset.com> wrote:
>  > If you want to get all the family members, you can just specify
>  > 'familyname:' as the column. This creates a wild-card scanner which
>  > will do what you want without filters.
>  >
>  >  ---
>  >  Jim Kellerman, Senior Engineer; Powerset
>  >
>  >
>  >
>  >
>  >  > -----Original Message-----
>  >  > From: Clint Morgan [mailto:clint.a.m@gmail.com]
>  >  > Sent: Thursday, March 20, 2008 9:19 AM
>  >  > To: hbase-user@hadoop.apache.org
>  >  > Subject: Re: Filter omitting columns
>  >  >
>  >  > I was having a similar problem as well. Though I've never
>  >  > used just the column families to specify the columns (eg
>  >  > always fully qualified col names like family:col). Maybe you
>  >  > can try my patch and see if it fixes your problem.
>  >  >
>  >  > https://issues.apache.org/jira/browse/HBASE-527
>  >  >
>  >  > Also you can give a null value for the row key regexp if you
>  >  > don't want to use it in RegExpRowFilter.
>  >  >
>  >  > -clint
>  >  >
>  >  > On Thu, Mar 20, 2008 at 7:23 AM, Goel, Ankur
>  >  > <Ankur.Goel@corp.aol.com> wrote:
>  >  > >
>  >  > >  Hi,
>  >  > >    I am trying to obtain a set of rows by obtaining a scanner on
>  >  > >  Htable. I also specify the RowFilterCriteria like this.
>  >  > >
>  >  > >  /* Code Start */
>  >  > >  Map<Text, byte[]> columnFilter = new HashMap<Text, byte[]>();
>  >  > >  columnFilter.put(new Text("status:"), getBytes(1));
>  >  > >  RowFilterInterface rowFilter = new RegExpRowFilter(".*", columnFilter);
>  >  > >
>  >  > >  HTable myTable = new HTable(conf, new Text("myTable"));
>  >  > >  Text[] columns = {new Text("url:"), new Text("status:")};
>  >  > >  myTable.obtainScanner(columns, HConstants.EMPTY_START_ROW, rowFilter);
>  >  > >  /* Code End */
>  >  > >
>  >  > >  When I scan the table, I only get 'status:' column family and its
>  >  > >  values.
>  >  > >  The 'url:' family is not present.
>  >  > >
>  >  > >  In simple SQL the query translates to something like
>  >  > >
>  >  > >  SELECT url, status FROM mytable WHERE status=1;
>  >  > >
>  >  > >  What could be wrong ?
>  >  > >
>  >  > >  I eventually want to do something like this
>  >  > >
>  >  > >  SELECT url, status, date FROM mytable WHERE (status=1 or (status=2
>  >  > >  and [today's date] > date));
>  >  > >
>  >  > >  I have used RowFilterSet with RowFilterSet.Operation options to
>  >  > >  accomplish this but the omission of columns not on the filter column
>  >  > >  list by the filter beats me.
>  >  > >
>  >  > >  Thanks
>  >  > >  -Ankur
>  >  > >
