hbase-user mailing list archives

From Stephen Boesch <java...@gmail.com>
Subject Re: Nested data structures examples for HBase
Date Mon, 08 Sep 2014 23:50:47 GMT
Thanks Demai. That HBaseCon 2012 presentation leaned pretty heavily on
nested data as a common HBase design pattern.  Nothing magic there, but it is
good to confirm it as the accepted norm.  We will proceed with our
design considerations following the approach from that talk.

There was one moderately interesting point in the talk: store duplicate
copies of the nested data structures in separate column families, each
sorted to match a different application query pattern. The cost, of
course, is more disk space, but the faster access time is more important
than the additional disk usage.
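
For anyone reading this later, here is a rough sketch of the layout as I
understand it from this thread: nested levels folded into the qualifier,
plus a second copy of the same cells in another column family whose
qualifiers lead with a different field. It is plain Python with no HBase
client calls. The family names "by_path" and "by_leaf", the ":" separator,
and the sample record are my own assumptions, not taken from the talk. It
relies only on the fact that HBase keeps qualifiers within a family in
byte-lexicographic order.

```python
# Sketch only: models the cell layout with dicts; it does not talk to HBase.
# "by_path", "by_leaf" and the ":" separator are hypothetical choices.

def flatten(nested, prefix=()):
    """Yield (path_tuple, leaf_value) for every leaf in a nested dict."""
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def to_cells(nested, sep=":"):
    """Build {family: {qualifier_bytes: value_bytes}} for a single row."""
    cells = {"by_path": {}, "by_leaf": {}}
    for path, value in flatten(nested):
        encoded = str(value).encode("utf-8")
        # Copy 1: qualifier spells the path from the top level down.
        cells["by_path"][sep.join(path).encode("utf-8")] = encoded
        # Copy 2: same value, path reversed so the leaf name sorts first.
        cells["by_leaf"][sep.join(reversed(path)).encode("utf-8")] = encoded
    return cells


def scan_prefix(family_cells, prefix):
    """Mimic a qualifier-prefix read: sorted qualifiers starting with prefix."""
    return [q for q in sorted(family_cells) if q.startswith(prefix)]


# Hypothetical sample record with three nested levels.
record = {
    "orders": {
        "o1": {"sku": "A", "qty": 2},
        "o2": {"sku": "B", "qty": 5},
    },
    "name": "acme",
}
cells = to_cells(record)

# Everything under one nested item, read from the path-ordered family.
print(scan_prefix(cells["by_path"], b"orders:o1:"))
# The same leaf field across all items, read from the leaf-ordered family.
print(scan_prefix(cells["by_leaf"], b"qty:"))
```

Each query pattern becomes a contiguous prefix in one of the two families,
which is where the extra disk space goes. Keys that themselves contain the
separator would make the qualifiers ambiguous, so a real design would need
an escaping or fixed-width scheme.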

2014-09-08 15:59 GMT-07:00 Demai Ni <nidmgg@gmail.com>:

> There was a schema design talk at HBaseCon 2012 (
> http://www.slideshare.net/cloudera/5-h-base-schemahbasecon2012). There was
> also a video, linked from http://hbasecon.com/archive.html, but that link
> is broken.
>
> Anyway, if I remember correctly, the idea is to use the column (aka
> column family) as one layer and the qualifier as another layer, and play
> some tricks around it. You may like to take a look to see whether it fits
> your needs. The layers have to be simple; an alternative is to save a JSON
> document (or another structured format) as the raw value.
>
> Demai
>
> On Mon, Sep 8, 2014 at 2:06 PM, Stephen Boesch <javadba@gmail.com> wrote:
>
> > While I am aware that HBase does not have native support for nested
> > structures, surely some of you have thought through this use case
> > carefully.
> >
> > Our particular use case is likely to have a single-digit number of
> > nested levels, with tens to hundreds of items in the lists at each level.
> >
> > An example would be:
> >
> >  top level: 300 items
> >  middle level: 1 to 100 items ("1 value" may indicate a single value
> >  as opposed to a list)
> >  third level: 1 to 50 items
> >  fourth level: 1 to 20 items
> >
> > The column names are likely known ahead of time, which may or may not
> > matter for HBase.  We could model the above structure in a Parquet file
> > or in Hive (with nested structs), but we would like to consider whether
> > HBase might also be an option.
> >
>
