hadoop-common-issues mailing list archives

From "Huafeng Wang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13633) Introduce Apache Kafka as a Service into Hadoop
Date Fri, 14 Oct 2016 08:20:20 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Huafeng Wang updated HADOOP-13633:
    Attachment: IntroduceApacheKafkaasaServiceinHadoop.pdf

Here is the draft design document.
Many thanks to [~zhz], [~drankye], [~rakeshr], [~umamaheswararao], [~hayabusa], and [~zhouwei]
for collaborating on this design.
Any advice or comments on the design are appreciated.

> Introduce Apache Kafka as a Service into Hadoop
> -----------------------------------------------
>                 Key: HADOOP-13633
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13633
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Huafeng Wang
>            Assignee: Huafeng Wang
>         Attachments: IntroduceApacheKafkaasaServiceinHadoop.pdf
> In HDFS-7343 we want to develop a comprehensive storage management solution that originated
from community discussions, to allow convenient, intelligent, and effective utilization
of various HDFS facilities such as erasure coding, HDFS cache, and HSM offerings, based
on valuable insights from events and data collected from namenodes, datanodes, frameworks,
and applications via a pub-sub messaging system. In HDFS-8940 it was discussed that the proposed
large-scale inotify feature would be better implemented on top of Kafka, to allow
thousands of consumers or inotify clients.
> Apache Kafka is a distributed messaging system that aims to provide a unified, high-throughput,
low-latency platform for handling real-time data feeds, and it is currently widely used for
real-time stream processing. Considering the two important use cases above, we’d like to
propose introducing Kafka as a fundamental event pub-sub service in the Hadoop platform.
Like the FileSystem offering, we’d like to provide a MessagingSystem in Hadoop
style that conforms to Hadoop security, backed by an internal or an existing external Kafka cluster.
The new service is intended to be convenient to use, and can be used to distribute and exchange
various types of events across IO, storage, and computation that are produced by Hadoop itself
or by frameworks and applications on top of it. On this basis, valuable events can be analyzed
in a centralized way so that meaningful applications and use cases can be developed.
> The design document is in progress and will be submitted within a week. Feedback is very
welcome. Thanks!

This message was sent by Atlassian JIRA