hbase-user mailing list archives

From Brian Jeltema <brian.jelt...@digitalenvoy.net>
Subject Re: Column families
Date Thu, 22 Jun 2017 12:52:19 GMT
One use case from my own tables: I have a table where one set of columns
holds data that is always processed by MR jobs, while other, rather large
columns are generally accessed only through a UI. Putting those into two
separate column families lets MR jobs that do a full table scan on the MR
column family run more efficiently, because the scans don’t read any data
from the ‘ui’ column family.
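
To make the effect concrete, here is a toy model in plain Java. It is not
HBase code and does not use the HBase client; it only assumes that each
column family keeps its cells in its own store, so a scan restricted to one
family never touches the other. The class name, method names, and the cell
sizes (16 bytes for 'mr', 4096 bytes for 'ui') are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model only (hypothetical names): every column family keeps its cell
 * values in a separate list, loosely mirroring per-family storage.
 */
public class FamilyScanSketch {
    // family name -> cell values written to that family (row keys omitted)
    private final Map<String, List<byte[]>> stores = new HashMap<>();

    /** Append one cell value to the given family's store. */
    public void put(String family, byte[] value) {
        stores.computeIfAbsent(family, f -> new ArrayList<>()).add(value);
    }

    /** Bytes a full scan reads when it is restricted to one family. */
    public long bytesReadScanningFamily(String family) {
        long total = 0;
        List<byte[]> cells = stores.getOrDefault(family, Collections.emptyList());
        for (byte[] value : cells) {
            total += value.length;
        }
        return total;
    }

    /** Bytes a full scan reads when it is not restricted to any family. */
    public long bytesReadScanningAll() {
        long total = 0;
        for (String family : stores.keySet()) {
            total += bytesReadScanningFamily(family);
        }
        return total;
    }

    public static void main(String[] args) {
        FamilyScanSketch table = new FamilyScanSketch();
        for (int row = 0; row < 1000; row++) {
            table.put("mr", new byte[16]);    // small values read by MR jobs (assumed size)
            table.put("ui", new byte[4096]);  // large values read by the UI (assumed size)
        }
        System.out.println("scan 'mr' only: " + table.bytesReadScanningFamily("mr"));
        System.out.println("scan all:       " + table.bytesReadScanningAll());
    }
}
```

With the real client, the equivalent restriction is made by limiting the
Scan to one family (Scan.addFamily) before handing it to the MR job.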

> On Jun 22, 2017, at 8:44 AM, Alexander Ilyin <alexander@weborama.com> wrote:
> Hi,
> A general question regarding column families. The doc says that HBase
> doesn't do well with more than 2-3 column families because flushing and
> compactions are done on a per-region basis, which should be addressed in
> the future: http://hbase.apache.org/book.html#number.of.cfs
> Is this still the case in newer versions of HBase, or have there been
> improvements?
> I also don't understand why using several column families might be useful
> even if data access is column scoped. Why can't we just create several
> tables instead? Row key is stored with every cell anyway and it's possible
> to filter by column when querying.
> In general, I don't see when it would make sense to have more than one
> column family in a table, given the current limitations.
> Thanks in advance.
