phoenix-user mailing list archives

From "Riesland, Zack"
Subject Guidance to improve upsert performance
Date Wed, 03 Aug 2016 19:45:38 GMT

I'm working on a POC to use HBase + Phoenix as a DB layer for a system that consumes several
thousand (10,000 to 40,000) messages per second.

Our cluster is fairly small: 4 region servers supporting about a dozen tables. We are currently
experimenting with salting - our first pass was 4 regions.

The ultimate data size is also pretty small. The data compacts very nicely and after aggregation
and de-duplication, it is only on the order of tens of GB.

Querying these tables is reasonably performant right now, but upserting the data is not optimal
and I'm looking for some performance tips.

As I said, the incoming data is streamed (via Storm), at a rate of thousands of messages per
second.

After some basic benchmarking, it appears that Storm is able to consume the data much more
quickly than it can upsert it to Phoenix.

I understand that Phoenix is fundamentally designed for fast querying, and not necessarily
fast writing. But can anyone suggest some Phoenix and/or HBase parameters we should consider
tuning to improve performance? Any tips on designing something like this?
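
To make the design question concrete, here is a minimal sketch of one client-side pattern
that is commonly suggested for write-heavy Phoenix workloads: buffer rows and commit them in
groups instead of one commit per message. This is not our actual topology code. The names
BatchingWriter and flush_fn, and the batch sizes, are hypothetical and chosen only for
illustration. In a real client, flush_fn would presumably execute the buffered upserts on a
connection with auto-commit turned off and then commit once.

```python
from typing import Callable, List, Sequence, Tuple

Row = Tuple  # one row's column values, in column order


class BatchingWriter:
    """Buffers rows and hands them to flush_fn in groups of batch_size.

    flush_fn is a hypothetical callback; in a real client it would run the
    upserts for the batch and issue a single commit.
    """

    def __init__(self, flush_fn: Callable[[Sequence[Row]], None],
                 batch_size: int = 1000):
        if batch_size < 1:
            raise ValueError("batch_size must be >= 1")
        self._flush_fn = flush_fn
        self._batch_size = batch_size
        self._buffer: List[Row] = []

    def write(self, row: Row) -> None:
        """Queue one row; flush automatically once the buffer is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered (no-op when the buffer is empty)."""
        if not self._buffer:
            return
        # Swap the buffer out first so each row is handed over exactly once.
        # Note: if flush_fn raises, this batch is not retried by this sketch.
        batch, self._buffer = self._buffer, []
        self._flush_fn(batch)


if __name__ == "__main__":
    batches = []
    writer = BatchingWriter(batches.append, batch_size=3)
    for i in range(7):
        writer.write((i, f"msg-{i}"))
    writer.flush()  # push the final partial batch
    print([len(b) for b in batches])  # prints [3, 3, 1]
```

In a streaming topology the final flush() would also need to run on a timer, so that a
partially filled batch does not sit in memory indefinitely when traffic is light.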

Also, we have 3 secondary indexes in addition to the primary key. I'm guessing that this
creates a significant amount of overhead in terms of writing data. But the indexes are necessary
for query performance. Is it possible to force the index maintenance to behave in more of
a batch pattern? Maybe only update the index tables every X minutes? Even twice a day?

Thanks in advance for any tips!
