hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: multiple data versions vs. multiple rows?
Date Mon, 19 Jan 2015 19:37:47 GMT
Hi Yong,

A row is never split across two regions. If you plan to keep thousands of
versions, then depending on the size of your data you might end up with a
row bigger than your preferred region size.
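
As a rough illustration of that size concern, here is a back-of-the-envelope
check. It is only a sketch: the function names are hypothetical, the 10 GiB
limit is an assumed value that you should replace with your own
hbase.hregion.max.filesize setting, and it ignores per-cell key overhead and
compression.

```python
# Assumed region size limit (10 GiB); replace with your configured value.
REGION_MAX_BYTES = 10 * 1024**3


def estimated_row_bytes(num_versions, avg_event_bytes):
    """Rough payload size of one row that holds every version of the history."""
    return num_versions * avg_event_bytes


def row_exceeds_region(num_versions, avg_event_bytes,
                       region_max_bytes=REGION_MAX_BYTES):
    """True if the single row alone would outgrow the assumed region size."""
    return estimated_row_bytes(num_versions, avg_event_bytes) > region_max_bytes
```

With these assumptions, a few thousand small events per user stay far below
the limit, while a million events of around 20 KB each would not fit.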

If you just plan to keep a few versions of the history to look at, I would
say go with versions. If you plan to have a million versions because you
want to keep the whole event history, go with the row approach.

You can also consider the column qualifier approach. It has the same
constraint as versions, since the row still cannot be split across two
regions, but it might be easier to manage and still gives you the
consistency of being within a single row.
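
To make the difference between the row approach and the column qualifier
approach concrete, here is a minimal sketch of how the keys could be laid
out. Everything in it is an assumption for illustration: the helper names,
the use of a zero byte as separator, and the choice of an 8-byte big-endian
timestamp are not taken from HBase itself, and no HBase client calls are
shown.

```python
import struct


def encode_ts(ts_millis):
    """8-byte big-endian timestamp; byte order matches numeric order for ts >= 0."""
    return struct.pack(">q", ts_millis)


def event_row_key(user_id, ts_millis):
    """Row approach: one row per event, keyed by user id plus event time."""
    return user_id.encode("utf-8") + b"\x00" + encode_ts(ts_millis)


def event_cell(user_id, ts_millis):
    """Column qualifier approach: one row per user, one qualifier per event."""
    row_key = user_id.encode("utf-8")
    qualifier = encode_ts(ts_millis)
    return row_key, qualifier
```

In the first layout the events of one user are adjacent rows that sort by
time and can land in different regions. In the second they all share one row
key, so they stay together in a single region.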


2015-01-19 14:28 GMT-05:00 yonghu <yongyong313@gmail.com>:

> Dear all,
> I want to record the user history data. I know there exists two options,
> one is to store user events in a single row with multiple data versions and
> the other one is to use multiple rows. I wonder which one is better for
> performance?
> Thanks!
> Yong
