chukwa-dev mailing list archives

From "shreyas subramanya (JIRA)" <>
Subject [jira] [Updated] (CHUKWA-707) Replace Chukwa collector with Apache Kafka
Date Tue, 22 Jul 2014 23:58:38 GMT


shreyas subramanya updated CHUKWA-707:

    Attachment: CHUKWA-707.patch1

I have created the first patch for the Kafka integration and uploaded it for review. It currently
uses Kafka as a replacement for the in-memory chunk queue in Chukwa. The flow is as follows:
  Producer side: Adaptor -> KafkaQueue -> KafkaBroker
  Consumer side: KafkaConnector -> multiple KafkaConsumer threads -> PipelineWriters
  (each KafkaConsumer sets up its own pipeline)
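
To make the shape of this flow concrete, here is a minimal sketch of it. It is not the code in the patch: the class FlowSketch and its run method are hypothetical, an in-memory BlockingQueue stands in for the Kafka broker, and a shared list stands in for the sink the pipeline writers write to. It only illustrates one producer side feeding several consumer threads, each with its own writer.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the flow described above; NOT the classes in CHUKWA-707.patch1.
final class FlowSketch {
    // Sentinel telling a consumer thread to stop; one is enqueued per consumer.
    private static final String POISON = "\u0000EOF";

    /** Pushes chunks through a queue drained by `consumers` threads and returns what was written. */
    static List<String> run(List<String> chunks, int consumers) {
        BlockingQueue<String> broker = new LinkedBlockingQueue<>(); // stands in for the Kafka broker
        List<String> written = new CopyOnWriteArrayList<>();        // stands in for the writers' sink

        Thread[] threads = new Thread[consumers];
        for (int i = 0; i < consumers; i++) {
            final int id = i;
            // Each consumer thread owns its own "pipeline", identified here by a writer id.
            threads[i] = new Thread(() -> {
                try {
                    while (true) {
                        String chunk = broker.take();
                        if (chunk.equals(POISON)) {
                            return;
                        }
                        written.add("writer-" + id + ":" + chunk);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }

        // Producer side: the adaptor hands chunks to the queue.
        for (String chunk : chunks) {
            broker.add(chunk);
        }
        for (int i = 0; i < consumers; i++) {
            broker.add(POISON);
        }

        // Wait for every consumer to drain its share and exit.
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for consumers", e);
            }
        }
        return written;
    }
}
```

Calling FlowSketch.run(Arrays.asList("a", "b", "c"), 2) returns three entries, one per chunk, though which writer handled which chunk depends on thread scheduling.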

The following configurations are needed:
  -> chukwaAgent.chunk.queue = org.apache.hadoop.chukwa.datacollection.agent.KafkaQueue
(this sets up the Kafka producer)
  -> chukwa.agent.connector = org.apache.hadoop.chukwa.datacollection.connector.kafka.KafkaConnector
(this sets up the Kafka consumer)

Each data type will map to its own Kafka topic.

I am working on improving the following areas:
1. Partitioning the topics so that we can have parallelism in a consumer group
2. Making the key format configurable
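
As a rough illustration of how these two items relate, the sketch below builds a message key from a template and picks a partition from that key. Everything in it is an assumption for illustration: the class KeySketch, the {source} and {type} placeholders, and the hash-modulo partition choice are hypothetical, and they are neither the patch's design nor Kafka's own partitioner.

```java
// Hypothetical helper; names and placeholder syntax are illustrative only.
final class KeySketch {
    /** One topic per data type, as described above: the topic name is the data type itself. */
    static String topicFor(String dataType) {
        return dataType;
    }

    /** Fills a key template such as "{source}/{type}" with values from a chunk. */
    static String formatKey(String template, String source, String dataType) {
        return template.replace("{source}", source).replace("{type}", dataType);
    }

    /** Maps a key to a partition in [0, numPartitions); the same key always maps to the same partition. */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

With more than one partition per topic, the consumers in a group can each read a different partition, which is where the parallelism in item 1 would come from; the key template decides which chunks end up together.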

> Replace Chukwa collector with Apache Kafka
> ------------------------------------------
>                 Key: CHUKWA-707
>                 URL:
>             Project: Chukwa
>          Issue Type: New Feature
>            Reporter: Eric Yang
>            Assignee: shreyas subramanya
>         Attachments: CHUKWA-707.patch1
> The Chukwa collector has not evolved since 2010.  Newer frameworks offer better message
> queue features, and Apache Kafka looks like a good replacement for the Chukwa collector.
> The Chukwa agent can implement a connector to Apache Kafka to replace the Chukwa collector,
> and an HBase consumer to write data to HBase.  The HICC REST API would change to use the new
> HBase storage format.

This message was sent by Atlassian JIRA
