phoenix-user mailing list archives

From: Konstantinos Kougios <kostas.koug...@googlemail.com>
Subject: does phoenix+hbase work for tables larger than a few GB?
Date: Wed, 30 Sep 2015 19:10:21 GMT
Hi all,

I have run into various issues with big tables while experimenting over
the last couple of weeks.

My impression is that HBase (+Phoenix) only works when there is a fairly
powerful cluster: say, when half the data fits into the combined memory
of the servers and the disks are fast (SSD?). It doesn't seem able to
cope when tables are 2x as large as the memory allocated to the region
servers (frankly, I think the limit is even lower).

Things that constantly fail:

- non-trivial queries on large tables (with GROUP BY, counts, joins)
fail with region server out-of-memory errors, or the region servers
crash for no apparent reason, with an Xmx of 4G or 8G
- index creation on the same big tables. These always fail, I think
around the point where HBase has to flush its in-memory regions to
disk, and I couldn't find a solution
- Spark jobs fail unless they are throttled to feed HBase only as much
data as it can take (see the sketch after this list). Is there no
backpressure?
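
For reference, this is roughly the shape of the throttled write I ended
up with. It is only a minimal sketch: the table, the columns, the
ZooKeeper quorum and the batch size are illustrative, not my exact job.

    import java.sql.DriverManager
    import org.apache.spark.rdd.RDD

    // Upsert in small committed batches per partition, instead of
    // letting Spark flood the region servers all at once. batchSize
    // gets tuned down until the cluster copes.
    def throttledWrite(rdd: RDD[(Long, String)], batchSize: Int = 1000): Unit = {
      rdd.foreachPartition { rows =>
        val conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181")
        conn.setAutoCommit(false)
        val stmt = conn.prepareStatement(
          "UPSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)")
        var n = 0
        for ((id, value) <- rows) {
          stmt.setLong(1, id)
          stmt.setString(2, value)
          stmt.executeUpdate()
          n += 1
          if (n % batchSize == 0) conn.commit() // flush a small batch
        }
        conn.commit() // flush the remainder
        stmt.close()
        conn.close()
      }
    }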

There were no replies to my earlier emails about these issues, which
makes me think there are no solutions (or the solutions are pretty hard
to find and not many people know them).

So after 21 tweaks to the default config, I am still not able to operate 
it as a normal database.
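
To give a flavour, these are the kind of knobs I have been turning (the
values here are illustrative, not my exact config):

    # hbase-env.sh: region server heap (I have tried 4G and 8G)
    export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xmx8g -Xms8g"

    <!-- hbase-site.xml -->
    <property>
      <name>hbase.regionserver.handler.count</name>
      <value>30</value>
    </property>
    <property>
      <name>phoenix.query.timeoutMs</name>
      <value>600000</value>
    </property>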

Should I start believing that my config is all wrong, or that
HBase+Phoenix only works if there is a sufficiently powerful cluster to
handle the data?

I believe it is a great project and the functionality is really useful.
What's lacking is three sample configs for three clusters of different
strengths.

Thanks
