hadoop-common-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10641) Introduce Coordination Engine interface
Date Mon, 25 Aug 2014 19:29:01 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109596#comment-14109596 ]

Konstantin Shvachko commented on HADOOP-10641:
----------------------------------------------

Thanks Henry and Andrey for the benchmark results. To summarize, we have three benchmarks:
# NNThroughputBenchmark, which gives us the upper bound of NN throughput.
# ZK benchmark, which measures the performance of ZK itself.
# ZK-CE benchmark, which measures the performance of the CE implementation based on ZK.

Ideally we would like to see NNThroughput <= ZK-CE throughput <= ZK throughput, so that the
CE does not become the bottleneck for the NN.
Looking just at the create operation:
- NNThroughput yields 13K ops/sec with 400 threads, which seems to be optimal for that hardware configuration.
- ZK throughput is substantially higher on SSD: 34K ops/sec.
- ZK-CE runs at 8.4K ops/sec, which is slower than NNThroughput.

So there is work to do here. I think the CE implementation can be optimized to get on par with
or close to ZK performance.
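
As a rough check of the numbers above, the sketch below tests the desired ordering and prints how far ZK-CE is from the other two figures. The class and method names (ThroughputCheck, satisfiesOrdering, ratio) are made up for illustration; only the three throughput figures come from the benchmark results quoted in this comment.

```java
public class ThroughputCheck {
    // Create-operation throughput in ops/sec, as quoted in the comment above.
    static final double NN_THROUGHPUT = 13_000;
    static final double ZK_THROUGHPUT = 34_000;
    static final double ZK_CE_THROUGHPUT = 8_400;

    /** True when NNThroughput <= ZK-CE throughput <= ZK throughput. */
    static boolean satisfiesOrdering(double nn, double zkCe, double zk) {
        return nn <= zkCe && zkCe <= zk;
    }

    /** Ratio of one throughput figure to another. */
    static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        System.out.println("ordering holds: "
            + satisfiesOrdering(NN_THROUGHPUT, ZK_CE_THROUGHPUT, ZK_THROUGHPUT));
        System.out.println("ZK-CE / NNThroughput = "
            + ratio(ZK_CE_THROUGHPUT, NN_THROUGHPUT));
        System.out.println("ZK-CE / ZK = "
            + ratio(ZK_CE_THROUGHPUT, ZK_THROUGHPUT));
    }
}
```

With these figures the ordering check fails: ZK-CE reaches roughly 65% of the NNThroughput number and about a quarter of the raw ZK number, so it would need to be about 1.5 times faster just to match NNThroughput.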

> Introduce Coordination Engine interface
> ---------------------------------------
>
>                 Key: HADOOP-10641
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10641
>             Project: Hadoop Common
>          Issue Type: New Feature
>    Affects Versions: 3.0.0
>            Reporter: Konstantin Shvachko
>            Assignee: Plamen Jeliazkov
>         Attachments: HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, HADOOP-10641.patch, NNThroughputBenchmark Results.pdf, ce-tla.zip, hadoop-coordination.patch, zkCEBenchmark.pdf, zkbench.pdf
>
>
> Coordination Engine (CE) is a system that allows agreement on a sequence of events in a distributed system. To be reliable, the CE should itself be distributed.
> A Coordination Engine can be based on different algorithms (Paxos, Raft, 2PC, ZAB) and have different implementations, depending on use cases and on reliability, availability, and performance requirements.
> The CE should have a common API, so that it can serve as a pluggable component in different projects. The immediate beneficiaries are HDFS (HDFS-6469) and HBase (HBASE-10909).
> The first implementation is proposed to be based on ZooKeeper.
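
To make the idea of a common, pluggable API concrete, here is a minimal sketch of what such an interface could look like. It is not the interface from the HADOOP-10641 patch: the names CoordinationEngine, LocalCoordinationEngine, submit and onAgreement are hypothetical, and the implementation is a single-process stand-in that only orders proposals with a lock. A real engine would be distributed and backed by ZooKeeper or another consensus protocol.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

public class CoordinationSketch {
    /** Hypothetical pluggable API: callers see only an agreed sequence of proposals. */
    interface CoordinationEngine<P> {
        /** Submits a proposal and returns the sequence number it was agreed at. */
        long submit(P proposal);

        /** Registers a callback invoked for every agreed proposal, in sequence order. */
        void onAgreement(BiConsumer<Long, P> listener);
    }

    /**
     * Single-process stand-in (an assumption for illustration only).
     * It is neither distributed nor fault tolerant; it just serializes proposals.
     */
    static class LocalCoordinationEngine<P> implements CoordinationEngine<P> {
        private final List<BiConsumer<Long, P>> listeners = new ArrayList<>();
        private long nextSequence = 0;

        @Override
        public synchronized long submit(P proposal) {
            long sequence = nextSequence++;
            // Deliver the agreement to every listener before the next one is assigned.
            for (BiConsumer<Long, P> listener : listeners) {
                listener.accept(sequence, proposal);
            }
            return sequence;
        }

        @Override
        public synchronized void onAgreement(BiConsumer<Long, P> listener) {
            listeners.add(listener);
        }
    }

    public static void main(String[] args) {
        CoordinationEngine<String> engine = new LocalCoordinationEngine<>();
        List<String> agreed = new ArrayList<>();
        engine.onAgreement((sequence, proposal) -> agreed.add(sequence + ":" + proposal));
        engine.submit("create /a");
        engine.submit("create /b");
        System.out.println(agreed); // prints [0:create /a, 1:create /b]
    }
}
```

Because callers depend only on the interface, a ZooKeeper-based engine could replace the local one without changes on the HDFS or HBase side, which is the point of keeping the API common.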



--
This message was sent by Atlassian JIRA
(v6.2#6252)
