For the usage example that you provided, how do the values of id_1, id_2 and other_key vary when you write data?
I assume id_1 and id_2 remain the same while other_key is monotonically increasing, and that's why the table is salted.
If you create the salt bucket only on id_2, then wouldn't you run into region server hotspotting during writes?
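
To make the hotspotting concern concrete, here is a rough sketch under my assumed write pattern (id_1 and id_2 fixed, other_key increasing). The helper bucket_of and the CRC32 hash are hypothetical stand-ins for illustration only; they are not Phoenix's actual salting function.

```python
from collections import Counter
import zlib

NUM_BUCKETS = 8  # assumed bucket count, chosen only for illustration


def bucket_of(parts, num_buckets):
    """Hypothetical salt: CRC32 of the joined key parts, modulo the bucket count."""
    joined = "\x00".join(str(p) for p in parts).encode("utf-8")
    return zlib.crc32(joined) % num_buckets


# Assumed write pattern: (id_1, id_2, other_key) with only other_key changing.
writes = [("u1", "g1", k) for k in range(1000)]

# Salting on the full rowkey spreads the writes over several buckets.
full_key = Counter(bucket_of(w, NUM_BUCKETS) for w in writes)

# Salting on id_2 alone sends every write to the same bucket.
id2_only = Counter(bucket_of((w[1],), NUM_BUCKETS) for w in writes)

print("buckets hit, full-key salt:", len(full_key))
print("buckets hit, id_2-only salt:", len(id2_only))
```

If the real workload writes many distinct id_2 values concurrently, the skew would be much smaller, which is why I am asking how the values vary.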

On Thu, Sep 13, 2018 at 8:02 PM, Jaanai Zhang <cloud.poster@gmail.com> wrote:
Sorry, I don't understand your purpose. Based on your proposal, it seems this can't be achieved. What you need is hash partitioning. However, one thing needs to be clarified: HBase is a range-partitioned engine, and salt buckets are used to avoid hotspots. In other words, HBase as a storage engine can't support hash partitioning.

----------------------------------------
   Jaanai Zhang
   Best regards!



On Thu, Sep 13, 2018 at 11:32 PM, Gerald Sangudi <gsangudi@23andme.com> wrote:
Hi folks,

Any thoughts or feedback on this?

Thanks,
Gerald

On Mon, Sep 10, 2018 at 1:56 PM, Gerald Sangudi <gsangudi@23andme.com> wrote:
Hello folks,

We have a requirement for salting based on partial, rather than full, rowkeys. My colleague Mike Polcari has identified the requirement and proposed an approach.

I found an already-open JIRA ticket for the same issue: https://issues.apache.org/jira/browse/PHOENIX-4757. I can provide more details from the proposal.

The JIRA proposes a syntax of SALT_BUCKETS(col, ...) = N, whereas Mike proposes SALT_COLUMN=col or SALT_COLUMNS=col, ... .

The benefit at issue is that users gain more control over partitioning, and this can be used to push some additional aggregations and hash joins down to region servers.
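
A minimal sketch of the idea, assuming a three-column rowkey (id_1, id_2, other_key) and salting on id_2 only. The function salt_bucket and the CRC32 hash are hypothetical and chosen only for illustration; they do not reproduce Phoenix's real salt computation.

```python
import zlib

NUM_BUCKETS = 8  # assumed value of N in SALT_BUCKETS = N


def salt_bucket(key_parts, num_buckets):
    """Hypothetical salt: CRC32 of the selected key columns, modulo N."""
    joined = "\x00".join(str(p) for p in key_parts).encode("utf-8")
    return zlib.crc32(joined) % num_buckets


# Illustrative rows as (id_1, id_2, other_key).
rows = [("a", "x", 1), ("a", "x", 2), ("b", "x", 3), ("a", "y", 4), ("b", "y", 5)]

# Salt computed from id_2 only: every row with the same id_2 shares a bucket.
buckets_by_id2 = {}
for id_1, id_2, other_key in rows:
    buckets_by_id2.setdefault(id_2, set()).add(salt_bucket((id_2,), NUM_BUCKETS))

for id_2, buckets in sorted(buckets_by_id2.items()):
    print(id_2, "->", sorted(buckets))
```

Because each id_2 value maps to exactly one bucket, a GROUP BY id_2 or a join on id_2 could in principle be completed within a bucket, without merging partial results across buckets.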

I would appreciate any go-ahead / thoughts / guidance / objections / feedback. I'd like to be sure that the concept at least is not objectionable. We would like to work on this and submit a patch down the road. I'll also add a note to the JIRA ticket.

Thanks,
Gerald