Are you looking toward replication to support DR? If so, you can rely on HBase-level replication with a few gotchas and some operational hurdles:
- When upgrading Phoenix versions, upgrade the server side first on both the primary and secondary clusters. You can do a rolling upgrade, and old clients will continue to work with the upgraded servers, so no downtime is required (see Backward Compatibility for more details).
- Execute Phoenix DDL (i.e. user-level changes to existing Phoenix tables, creation of new tables, indexes, or sequences) against both the primary and secondary cluster with replication suspended; otherwise you end up with a race condition between the replication of the SYSTEM.CATALOG table and any not-yet-existing tables. If you've upgraded Phoenix, then even if there's no DDL, you should at a minimum connect a Phoenix client to both the primary and secondary cluster to trigger any upgrades to Phoenix system tables. Once the DDL is complete, resume replication (see the DDL sketch after this list).
- Do not replicate the SYSTEM.SEQUENCE table: replication is asynchronous and may fall behind, which would be a big issue when switching over to the secondary cluster, as sequence values could start repeating. Instead, incorporate a cluster ID into any sequence-based identifiers by concatenating it with the sequence value. That way, the identifiers will continue to be unique after a DR event (see the sequence sketch below).
- Replicate Phoenix indexes just like data tables, as HBase-level replication of the data table will not trigger index updates (see the index sketch below).
- In theory, you really only need to replicate views from SYSTEM.CATALOG, since you're executing DDL on both the primary and secondary cluster; however, I don't think HBase has that capability (but it sure would be nice). FWIW, we're thinking of separating views from table definitions into separate Phoenix tables, but we'd first need to make those tables transactional (we currently use an HBase mechanism that provides all-or-none commits to SYSTEM.CATALOG, but it only works if all updates go to the same region server, which is too limiting).
- It's a good idea to monitor the depth of the replication queue so you know if/when replication is falling behind (see the monitoring sketch below).
- Care has to be taken with respect to keeping deleted cells on both clusters if you want to support point-in-time backup and restore, as it's possible that compaction would remove cells before your backup window has passed (this is orthogonal to replication, but worth bringing up; see the KEEP_DELETED_CELLS sketch below).
- Given the asynchronous nature of HBase replication, there's no good way of knowing the transaction ID (i.e. timestamp) at which you have all of the data. Also, replicating the state the transaction manager keeps about in-flight and invalid transactions is left as an exercise to the reader. :-) In short, there's still some work to do with respect to combining transactions and replication (but it'd be really interesting work if anyone is interested).
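
A few sketches to make the points above concrete. First, the suspend/DDL/resume cycle. This is a minimal sketch assuming the HBase 2.x Admin API (`disableReplicationPeer`/`enableReplicationPeer`; on HBase 1.x the equivalent calls live on `ReplicationAdmin`) and Phoenix's JDBC driver. The peer ID, ZooKeeper quorums, and DDL statement are placeholders for your own setup:

```java
import java.sql.Connection;
import java.sql.DriverManager;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class DdlWithReplicationSuspended {
    public static void main(String[] args) throws Exception {
        String peerId = "secondary";  // hypothetical replication peer ID
        String ddl = "CREATE TABLE IF NOT EXISTS MY_TABLE (ID VARCHAR PRIMARY KEY, V VARCHAR)";

        Configuration conf = HBaseConfiguration.create();
        try (org.apache.hadoop.hbase.client.Connection hbase =
                 ConnectionFactory.createConnection(conf);
             Admin admin = hbase.getAdmin()) {

            // 1. Suspend replication so SYSTEM.CATALOG edits shipped from the
            //    primary don't race with the DDL we run on the secondary directly.
            admin.disableReplicationPeer(peerId);

            // 2. Run the same DDL against both clusters; connecting a Phoenix
            //    client is also what triggers any system-table upgrades.
            for (String quorum : new String[] {"zk-primary:2181", "zk-secondary:2181"}) {
                try (Connection conn = DriverManager.getConnection("jdbc:phoenix:" + quorum)) {
                    conn.createStatement().execute(ddl);
                    conn.commit();
                }
            }

            // 3. Resume replication once the DDL is complete on both sides.
            admin.enableReplicationPeer(peerId);
        }
    }
}
```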
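
Next, cluster-scoped sequence identifiers. The cluster ID source, table, and sequence names here are all made up; the only real requirement is that the two clusters never emit the same prefix:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ClusterScopedIds {
    public static void main(String[] args) throws Exception {
        // Hypothetical: each cluster is deployed with its own short ID.
        String clusterId = System.getProperty("cluster.id", "dc1");

        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-primary:2181")) {
            conn.createStatement().execute("CREATE SEQUENCE IF NOT EXISTS ORDER_SEQ");
            conn.createStatement().execute(
                "CREATE TABLE IF NOT EXISTS ORDERS (ORDER_ID VARCHAR PRIMARY KEY, PAYLOAD VARCHAR)");

            // Fetch the next sequence value, then prefix it with the cluster
            // ID so identifiers stay unique even after failing over.
            long seq;
            try (ResultSet rs = conn.createStatement()
                    .executeQuery("SELECT NEXT VALUE FOR ORDER_SEQ")) {
                rs.next();
                seq = rs.getLong(1);
            }
            String orderId = clusterId + "-" + seq;

            try (PreparedStatement ps = conn.prepareStatement(
                    "UPSERT INTO ORDERS (ORDER_ID, PAYLOAD) VALUES (?, ?)")) {
                ps.setString(1, orderId);
                ps.setString(2, "some payload");
                ps.executeUpdate();
            }
            conn.commit();
        }
    }
}
```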
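
For indexes: a Phoenix global index is just another HBase table, so it needs replication enabled on its own. A sketch using `Admin.enableTableReplication` (available in HBase 1.1+), which sets the column families' replication scope and, as I understand it, copies the schema to the peer if the table is missing there; the table and index names are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class ReplicateTableAndIndex {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection hbase = ConnectionFactory.createConnection(conf);
             Admin admin = hbase.getAdmin()) {
            // The data table alone is not enough: replicated data-table edits
            // arrive below Phoenix, so they never touch the index.
            admin.enableTableReplication(TableName.valueOf("MY_TABLE"));
            // The global index lives in its own HBase table; replicate it too.
            admin.enableTableReplication(TableName.valueOf("MY_TABLE_IDX"));
        }
    }
}
```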
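
For monitoring queue depth, one option is to poll each region server's replication metrics over JMX. The MBean and attribute names below (`Hadoop:service=HBase,name=RegionServer,sub=Replication`, `source.sizeOfLogQueue`) match recent HBase versions but may differ in yours, and the JMX endpoint is a placeholder:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ReplicationQueueMonitor {
    public static void main(String[] args) throws Exception {
        // Hypothetical JMX endpoint for one region server.
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://regionserver-1:10102/jmxrmi");
        try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
            ObjectName replication = new ObjectName(
                "Hadoop:service=HBase,name=RegionServer,sub=Replication");
            // Number of WALs queued for shipping; a steadily growing value
            // means replication is falling behind on this region server.
            // source.ageOfLastShippedOp is another useful lag signal.
            Object queueSize = mbsc.getAttribute(replication, "source.sizeOfLogQueue");
            System.out.println("source.sizeOfLogQueue = " + queueSize);
        }
    }
}
```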
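
Finally, keeping deleted cells for point-in-time restore. Phoenix passes HBase column-family properties through its DDL, so something along these lines should work (values are illustrative; pick VERSIONS and your compaction/TTL settings so cells outlive your backup window, and note this is itself DDL, so run it on both clusters per the DDL point above):

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class KeepDeletedCellsSetup {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-primary:2181")) {
            // Keep delete markers and extra versions so a point-in-time
            // restore within the backup window can still see them.
            conn.createStatement().execute(
                "ALTER TABLE MY_TABLE SET KEEP_DELETED_CELLS=true, VERSIONS=5");
            conn.commit();
        }
    }
}
```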