hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <>
Subject [jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits
Date Wed, 31 Aug 2016 22:25:20 GMT


Sergey Shelukhin updated HIVE-14589:
    Attachment: HIVE-14589.03.patch

Adding the tests (3 from curator pretty much, 3 new) and addressing RB feedback.

> add consistent node replacement to LLAP for splits
> --------------------------------------------------
>                 Key: HIVE-14589
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch, HIVE-14589.03.patch, HIVE-14589.patch
> See HIVE-14574. (copied from the comment below) This basically creates the nodes in ZK
for "slots" in the cluster. The LLAPs try to take the lowest available slot, starting from
0. Unlike worker-... nodes, the slots are reused, which is the intent. The LLAPs are always
sorted by the slot number for splits.
> The idea is that as long as an LLAP is running, it will retain the same position in the
ordering, regardless of other LLAPs restarting, and without the LLAPs knowing about each other,
their predecessors' locations (if restarted in a different place), or the total size of the cluster.
> The restarting LLAPs may not take the same positions as their predecessors (e.g. if two
LLAPs restart they can swap slots), but it shouldn't matter because they have lost their cache.
> E.g. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4, they will
take whatever slots are free, but 3 will stay the 3rd and retain cache locality.
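
A minimal sketch of the slot scheme described above, to make the restart example concrete. It is not the patch's code: an in-memory dictionary stands in for the ZK slot nodes, and SlotRegistry, take_lowest_free_slot, release and the llap-* ids are hypothetical names chosen for illustration.

```python
class SlotRegistry:
    """In-memory stand-in for the ZK slot nodes (hypothetical, for illustration only)."""

    def __init__(self):
        self._owner_by_slot = {}  # slot number -> daemon id

    def take_lowest_free_slot(self, daemon_id):
        # Scan upward from 0 and claim the first slot nobody holds.
        slot = 0
        while slot in self._owner_by_slot:
            slot += 1
        self._owner_by_slot[slot] = daemon_id
        return slot

    def release(self, daemon_id):
        # Models the daemon's slot becoming free again when the daemon goes away.
        self._owner_by_slot = {
            s: d for s, d in self._owner_by_slot.items() if d != daemon_id
        }

    def daemons_in_slot_order(self):
        # The ordering used for splits: daemons sorted by slot number.
        return [self._owner_by_slot[s] for s in sorted(self._owner_by_slot)]


registry = SlotRegistry()
for daemon in ["llap-a", "llap-b", "llap-c", "llap-d"]:
    registry.take_lowest_free_slot(daemon)

position_before = registry.daemons_in_slot_order().index("llap-c")

# Kill three daemons, then bring up replacements in a different order.
for daemon in ["llap-a", "llap-b", "llap-d"]:
    registry.release(daemon)
for daemon in ["llap-d2", "llap-a2", "llap-b2"]:
    registry.take_lowest_free_slot(daemon)

position_after = registry.daemons_in_slot_order().index("llap-c")
```

In this run the replacements end up in different slots than their predecessors, while llap-c, which never restarted, keeps its position in the ordering.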
> This also handles size increase, as new LLAPs will always be added to the end of the
sequence, which is what consistent hashing needs.
> One case it doesn't handle is permanent cluster size reduction. There will be a permanent
gap if the LLAPs that are removed held slots in the middle; until some LLAPs are restarted,
the gap will result in misses.
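
The growth and gap cases above can be sketched as follows. This is an illustration under stated assumptions, not the patch's logic: it uses the published jump consistent hash algorithm as one example of a consistent hash that only tolerates buckets being added at the end, and slot_for_split is a hypothetical helper that treats a split hashing to an empty slot as a miss.

```python
def jump_hash(key, num_buckets):
    """Jump consistent hash (Lamping and Veach); key is a non-negative integer."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & 0xFFFFFFFFFFFFFFFF
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def slot_for_split(split_key, live_slots):
    """Hypothetical helper: hash over the slot range, None if that slot is empty."""
    size = max(live_slots) + 1  # range of slot numbers, including any gaps
    slot = jump_hash(split_key, size)
    return slot if slot in live_slots else None


split_keys = range(1000)

# Growth: a fifth LLAP takes slot 4 at the end of the sequence.
# Splits either keep their old slot or move to the new one.
moved = [k for k in split_keys if jump_hash(k, 4) != jump_hash(k, 5)]

# Permanent reduction: the LLAP in slot 2 is removed and nothing restarts.
# Splits that hashed to slot 2 now have no home; the others are unaffected.
with_gap = {0, 1, 3}
misses = [k for k in split_keys if slot_for_split(k, with_gap) is None]
```

Under these assumptions, appending a slot only moves splits onto the new LLAP, whereas a gap in the middle leaves some splits pointing at a slot nobody holds until a restarting LLAP fills it.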

This message was sent by Atlassian JIRA
