Hi there,


I am working with James Heather on this - does anyone have any pointers?


Many thanks,

Tom



From: Heather, James (ELS) <james.heather@elsevier.com>
Sent: 03 August 2016 15:28
To: user@phoenix.apache.org
Subject: Advice on Phoenix config
 
Hi,

We've got quite a wide table (maybe 50 cols) with about 1 billion rows in it, currently stored in MySQL; we're looking at moving it into Phoenix. The PK there is an autoincrement column, but each row also contains a UUID, and that would probably naturally become the PK in Phoenix. There are several other tables that hang off this table, in the sense that the PK of the main table is a foreign key in these other tables. There are several indexed columns in MySQL that would also need to carry over as indexes in Phoenix.

Most of the queries are reads, but maybe 20% of them are writes. Almost all of them are small, doing point lookups or returning a few rows based on one of the indexes.

Can anyone suggest sensible Phoenix/HBase config to get decent performance out of this? Specifically:

  1. How should we encode the UUID? As BINARY(16)? And if this is the PK, and they are randomly generated UUIDs, presumably salting is unnecessary?
  2. How many nodes should we expect to need to give us at least as good performance as our MySQL database with 1 billion rows?
  3. How many regions?
  4. Presumably this will start to outperform MySQL as the number of rows in the database increases? When we've got 10 billion rows, MySQL might struggle but hopefully Phoenix will be fine?
  5. Are there any particular HBase configs we should be aware of (RPC timeouts etc.) that we'll need to tweak to get decent performance? This applies partly to the bulk loading process (data migration) at the beginning, but also afterwards when it's released into production.
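
To make question 1 concrete, here is a minimal sketch of the encoding we have in mind, written in Python purely for illustration. The helper names uuid_to_pk and pk_to_uuid are hypothetical and not part of Phoenix; the sketch only assumes that the key column would hold the UUID's 16 raw bytes rather than its 36-character text form.

```python
import uuid


def uuid_to_pk(value: str) -> bytes:
    """Pack a textual UUID into the 16 raw bytes a BINARY(16) key would hold."""
    return uuid.UUID(value).bytes


def pk_to_uuid(raw: bytes) -> str:
    """Recover the canonical textual UUID from a 16-byte key."""
    return str(uuid.UUID(bytes=raw))


if __name__ == "__main__":
    # Example value only; real keys would come from the existing MySQL rows.
    key = uuid_to_pk("123e4567-e89b-12d3-a456-426614174000")
    print(len(key), key.hex())
    print(pk_to_uuid(key))
```

The leading bytes of a randomly generated (version 4) UUID are random, which is the reasoning behind our guess that salting would add nothing; we would welcome correction if that assumption is wrong.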

We'd be extremely grateful for any tips.

James


Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales.


