helix-commits mailing list archives

From "Kanak Biscuitwala (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HELIX-345) Speed up the controller pipelines
Date Tue, 14 Jan 2014 19:37:58 GMT

    [ https://issues.apache.org/jira/browse/HELIX-345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871015#comment-13871015 ]

Kanak Biscuitwala commented on HELIX-345:

We came up with 4 approaches to compare:

1. The existing method of re-reading the entire cluster
2. Caching messages, purging them if names change
3. Caching all properties, updating each one when its associated stat changes
4. Use message callbacks to mark paths as dirty, and only read those paths

I set up a sample cluster with 100 nodes, 100 resources, 100 partitions per resource, and
3 replicas per partition.

1. The unchanged read stage can read the cluster in 2.5-5 s, with an average around 3.5-4 s.
2. Caching messages means that the first pipeline read will take 4-5 s, but subsequent reads
will take about 0.5 s.
3. What I found here was that this approach was no faster than the baseline. The crux of the
problem is that reading stats is not a free operation. On average this would take 2-2.5 s.
Combined with reading the changed properties and the added CPU time to manage the cache, the
savings are minimal, if they exist at all.
4. This approach suffers from the following problem: if the messages of a node change, the
current state must also be read. However, it also has the benefit of only impacting reads
at a single-node scope. What I realized when testing for correctness is that this
single-node scope itself causes problems. Imagine that we send transition messages, leading
to message callbacks. These messages go to multiple nodes, but we can only invalidate the
cache for a single node at a time. So if we queue the callbacks as we currently do, the
controller generates duplicate messages. If we run the pipeline in a new thread, which we
will, the result depends on which thread wins the race. There are a lot of safety
considerations, which ultimately lead to re-reading a lot of znodes.
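
To make the result for approach 3 concrete, here is a minimal sketch of a stat-checked
cache. It is not the code that was benchmarked. The class name and the two lookup functions
are hypothetical stand-ins for the ZooKeeper stat and data reads. The point it illustrates
is that every lookup still pays for one stat call, even on a cache hit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: caches property data and re-reads it when the stat version changes. */
class StatCheckedCache {
    /** A cached value together with the stat version it was read at. */
    private static final class Entry {
        final int version;
        final String data;
        Entry(int version, String data) { this.version = version; this.data = data; }
    }

    // Hypothetical stand-ins for the backing store: path -> stat version, path -> data.
    private final Function<String, Integer> statVersion;
    private final Function<String, String> readData;
    private final Map<String, Entry> cache = new HashMap<>();
    private int statCalls = 0;
    private int dataReads = 0;

    StatCheckedCache(Function<String, Integer> statVersion, Function<String, String> readData) {
        this.statVersion = statVersion;
        this.readData = readData;
    }

    String get(String path) {
        // The stat is fetched on every call; this is the cost that is never avoided.
        int version = statVersion.apply(path);
        statCalls++;
        Entry entry = cache.get(path);
        if (entry == null || entry.version != version) {
            // Cache miss or stale entry: pay for the full read as well.
            entry = new Entry(version, readData.apply(path));
            dataReads++;
            cache.put(path, entry);
        }
        return entry.data;
    }

    int statCalls() { return statCalls; }
    int dataReads() { return dataReads; }
}
```

With N properties, a pipeline run under this scheme issues N stat calls plus one data read
per changed property, so the saving over the baseline is bounded by how much cheaper a stat
call is than a full read.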

Given this exploration, option 2 seems to be the safest and best-performing strategy.
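
For reference, option 2 could look roughly like the sketch below. It assumes that a message
stored under a given name does not change, so only the list of names has to be compared on
each run. The class name and the two lookup functions are hypothetical and are not taken
from the Helix code base.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Illustrative only: caches messages per instance, keyed by message name. */
class MessageCache {
    // Hypothetical stand-ins: listing names is assumed cheap, reading a message expensive.
    private final Function<String, List<String>> listNames; // instance -> message names
    private final Function<String, String> readMessage;     // "instance/name" -> message body

    // instance -> (message name -> cached message body)
    private final Map<String, Map<String, String>> cache = new HashMap<>();
    private int reads = 0;

    MessageCache(Function<String, List<String>> listNames, Function<String, String> readMessage) {
        this.listNames = listNames;
        this.readMessage = readMessage;
    }

    /** Returns the current messages for an instance, reading only names not yet cached. */
    Map<String, String> refresh(String instance) {
        Set<String> current = new HashSet<>(listNames.apply(instance));
        Map<String, String> cached = cache.computeIfAbsent(instance, k -> new HashMap<>());
        // Purge cached messages whose names have disappeared.
        cached.keySet().retainAll(current);
        // Read only the messages that are new since the last run.
        for (String name : current) {
            if (!cached.containsKey(name)) {
                cached.put(name, readMessage.apply(instance + "/" + name));
                reads++;
            }
        }
        return new HashMap<>(cached);
    }

    int reads() { return reads; }
}
```

The first call for an instance reads every message, and later calls read only the newly
added ones, which is consistent with a slow first pipeline run followed by much faster
subsequent runs.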

> Speed up the controller pipelines
> ---------------------------------
>                 Key: HELIX-345
>                 URL: https://issues.apache.org/jira/browse/HELIX-345
>             Project: Apache Helix
>          Issue Type: Bug
>    Affects Versions: 0.6.2-incubating, 0.7.0-incubating
>            Reporter: Kanak Biscuitwala
>            Assignee: Kanak Biscuitwala
> ReadClusterDataStage can take some time. We should have techniques for speeding it up,
> like parallelizing or caching.

This message was sent by Atlassian JIRA