hbase-user mailing list archives

From "Erik Holstad" <erikhols...@gmail.com>
Subject Re: How to get all columns from the scanner in a Map-Reduce job?
Date Mon, 20 Oct 2008 18:13:12 GMT
Hi Stack!
Will try that fix; opened Jira-941 in the meantime.

Regards Erik




On Sun, Oct 19, 2008 at 4:05 PM, Michael Stack <stack@duboce.net> wrote:

> What happens if you pass a column name of "^.*$"?  Will it return all
> columns?  I don't think it will.  IIRC the regex can only be applied to the
> column qualifier portion of column name which means you'd have to write out
> a column spec. for your mapreduce job per column family.  So, if you had
> three families but each had a thousand columns, and you wrote a column
> specification of "family1:.* family2:.* family3:.*", that should return them
> all.
>
> I took a quick look.  It should be the case that an empty string returns
> all columns of a row but currently at least, it'll fail on line #75 in
> TableInputFormat:
>
>   if (colArg == null || colArg.length() == 0) {
>
> Try removing the colArg.length() == 0 check.  Maybe it'll work then?
> (You'll pass in a zero-length array of columns -- I think that'll work).
>
> Meantime, open a JIRA, Erik.  Seems like a basic expectation, that there be
> a way to get all columns in an MR.
>
> St.Ack
>
>
> Erik Holstad wrote:
>
>> Hey!
>> Yes I did find that line in HAbstractScanner.java but not really sure
>>  how to use it to do what I want to do.
>>
>> Regards Erik
>>
>> On Sun, Oct 19, 2008 at 7:43 AM, Jean-Daniel Cryans <jdcryans@apache.org>
>> wrote:
>>
>>
>>
>>> I think you are looking for this :
>>>
>>> // Pattern to determine if a column key is a regex
>>>  static Pattern isRegexPattern =
>>>   Pattern.compile("^.*[\\\\+|^&*$\\[\\]\\}{)(]+.*$");
>>>
>>> J-D
>>>
>>> On Fri, Oct 17, 2008 at 9:39 PM, Erik Holstad <erikholstad@gmail.com>
>>> wrote:
>>>
>>>
>>>
>>>> Hi!
>>>> I'm trying to figure out how to get all the columns in a Map-Reduce job
>>>> without having to specify
>>>> them all?
>>>>
>>>> Found the line:
>>>> @see org.apache.hadoop.hbase.regionserver.HAbstractScanner for column name
>>>>  *      wildcards
>>>>
>>>> in TableInputFormat.java but didn't find any help over in the
>>>> HAbScanner.
>>>>
>>>> Regards Erik
>>>>
>>>>
>>>>
>>>
>>
>>
>
>
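
The two ideas in the thread (the per-family wildcard spec Stack describes and the isRegexPattern J-D quotes) can be tried out without a cluster. The following is only a rough sketch under stated assumptions: the class name ColumnSpecSketch and the helpers allColumnsSpec and looksLikeRegex are hypothetical and not part of HBase, and the pattern string is copied from the message above rather than checked against any particular HBase release. It uses nothing beyond java.util and java.util.regex.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ColumnSpecSketch {
    // Pattern as quoted in the thread from HAbstractScanner: a column key is
    // treated as a regex if it contains at least one of these metacharacters.
    static final Pattern IS_REGEX_PATTERN =
        Pattern.compile("^.*[\\\\+|^&*$\\[\\]\\}{)(]+.*$");

    // Hypothetical helper: would this column key be classified as a regex?
    public static boolean looksLikeRegex(String columnKey) {
        return IS_REGEX_PATTERN.matcher(columnKey).matches();
    }

    // Hypothetical helper: build one "family:.*" entry per column family,
    // separated by single spaces, as in Stack's example spec.
    public static String allColumnsSpec(List<String> families) {
        StringBuilder spec = new StringBuilder();
        for (String family : families) {
            if (spec.length() > 0) {
                spec.append(' ');
            }
            spec.append(family).append(":.*");
        }
        return spec.toString();
    }

    public static void main(String[] args) {
        String spec = allColumnsSpec(Arrays.asList("family1", "family2", "family3"));
        System.out.println("column spec: " + spec);

        // Each wildcard entry contains '*', so the pattern flags it as a regex.
        for (String column : spec.split(" ")) {
            System.out.println(column + " -> regex? " + looksLikeRegex(column));
        }

        // A plain column name has none of the metacharacters.
        System.out.println("family1:name -> regex? " + looksLikeRegex("family1:name"));
    }
}
```

Note that '.' is not in the character class, so it is the '*' in "family1:.*" that makes the entry count as a regex. Whether the resulting spec really returns every column still depends on how TableInputFormat and the scanner handle it, which this sketch does not exercise.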
