hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: What is the best hbase table schema for following json data?
Date Thu, 30 May 2013 19:09:51 GMT
But you should be able to write a custom column filter that handles JSON records within a cell.
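
As a rough sketch of the per-cell check such a filter would perform (not HBase code: a real filter would be a Java class running on the region servers, and the names cell_matches and filter_row are hypothetical), assuming each cell holds one click as a UTF-8 encoded JSON object:

```python
import json


def cell_matches(cell_value: bytes, wanted: dict) -> bool:
    """Return True if the JSON object stored in one cell contains
    every key/value pair listed in `wanted`."""
    try:
        record = json.loads(cell_value.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON; such cells never match.
        return False
    if not isinstance(record, dict):
        return False
    return all(key in record and record[key] == value
               for key, value in wanted.items())


def filter_row(cells: dict, wanted: dict) -> dict:
    """Keep only the cells of one row (qualifier -> raw bytes) that match."""
    return {qualifier: value
            for qualifier, value in cells.items()
            if cell_matches(value, wanted)}
```

Note that every cell still has to be read and parsed to apply this check, so it narrows what is returned to the client rather than avoiding the scan itself.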


On May 30, 2013, at 11:48 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> bq. Still these ColumnPrefixFilter will work in this case?
> 
> Probably not. Can you group the subset of keys at the beginning of the
> column (assuming the subset of keys is known and doesn't change) ?
> 
> bq. I am storing each click(set of key value pairs) in one cell say
> "clicks:event1". Is this OK?
> 
> This should be Okay.
> 
> On Wed, May 29, 2013 at 11:13 PM, AnilKumar B <akumarb2010@gmail.com> wrote:
> 
>> Hi Ted,
>> 
>> @You can utilize MultipleColumnPrefixFilter or ColumnPrefixFilter to speed
>> up scan.
>> [Anil] Thanks for the info. But I am storing all the key/value pairs
>> corresponding to one click in one column. Will ColumnPrefixFilter still
>> work in this case?
>> 
>> @How many key / value pairs does each 'click' have ?
>> [Anil] The number of key/value pairs is not fixed. It can vary from 20 to 200.
>> 
>> @Among these pairs, are you going to search for a subset of keys ?
>> [Anil] Yes.
>> 
>> 
>> 
>> In my schema, I am storing each click (a set of key/value pairs) in one cell,
>> say "clicks:event1". Is this OK? Or do I need to change the schema design so
>> that each key/value pair is stored as its own column? What is the better way
>> to store JSON data?
>> 
>> 
>> Thanks,
>> B Anil Kumar.
>> 
>> 
>> On Thu, May 30, 2013 at 9:42 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> 
>>> bq. 1) Suppose If I want search on key of click, It will be full scan
>>> 
>>> You can utilize MultipleColumnPrefixFilter or ColumnPrefixFilter to speed
>>> up scan.
>>> 
>>> How many key / value pairs does each 'click' have ? Among these pairs,
>>> are you going to search for a subset of keys ?
>>> 
>>> Cheers
>>> 
>>> On Wed, May 29, 2013 at 8:47 PM, AnilKumar B <akumarb2010@gmail.com>
>>> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> What is the best hbase table schema for following json data?
>>>> I need to store following JSON data in hbase.
>>>> {"Session":{"Header" :
>>>> {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....},
>>>> "clicks" : [{"click" : {"key1":"value1","key2":"value2",
>>>> "key3":"value3"....}}, {"click" : {"key1":"value1", "key2":"value2",
>>>> ....}}]}}
>>>> 
>>>> I have created the schema as below, but there seem to be some issues.
>>>> rowkey -> compositeKey of session fields
>>>> ColumnFamily 1 -> "Header" which consists of following columns
>>>> 1) Header:HeaderFields which stores "{"Header" :
>>>> {"key1":"value1","key2":"value2","key3":"value3","key4":"value4",....}}"
>>>> in one cell
>>>> 2) other columns
>>>> 
>>>> ColumnFamily 2 -> "clicks" and each "click" will be one column
>>>> 
>>>> The problems here are:
>>>> 1) Suppose I want to search on a key of a click. It will be a full scan. How
>>>> can I optimize my schema for such a search requirement?
>>>> 2) If I want to provide a secondary index for keys of clicks, how can
>>>> I implement it?
>>>> 
>>>> Thanks,
>>>> B Anil Kumar.
>>>> 
>>> 
>> 
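
For comparison, here is a minimal sketch of the alternative layout raised in the thread, where each key/value pair of a click becomes its own column and the key name is placed at the start of the column qualifier so that a prefix match can select it. This is one reading of the suggestion above, not a confirmed design; the "key:eventN" qualifier format and the helper names are assumptions, and the prefix selection is emulated in plain Python rather than run against HBase.

```python
def flatten_clicks(clicks):
    """Turn a list of click dicts into {qualifier: value},
    one column per key/value pair, with the key name first."""
    columns = {}
    for index, click in enumerate(clicks, start=1):
        for key, value in click.items():
            # Hypothetical qualifier layout: "<key>:event<N>"
            columns[f"{key}:event{index}"] = value
    return columns


def select_by_prefixes(columns, prefixes):
    """Emulate a column-prefix selection: keep the columns whose
    qualifier starts with any of the given prefixes."""
    wanted = tuple(prefixes)
    return {qualifier: value
            for qualifier, value in sorted(columns.items())
            if qualifier.startswith(wanted)}


clicks = [{"key1": "a", "key2": "b"}, {"key1": "c", "key3": "d"}]
columns = flatten_clicks(clicks)
# The trailing ":" keeps the prefix "key1:" from also matching "key10:...".
key1_columns = select_by_prefixes(columns, ["key1:"])
```

With 20 to 200 pairs per click this produces many more columns per row than the one-cell-per-click layout, which is the trade-off to weigh against being able to select keys by prefix.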

