chukwa-dev mailing list archives

From Eric Yang <>
Subject Re: What constitute a successful project?
Date Fri, 30 Nov 2012 06:39:31 GMT
Hi Jason,

IBM is using the Chukwa agent as the base of the monitoring component for
BigInsights, IBM's distribution of Apache Hadoop.  The monitoring system
shares the same design principles, but has been custom built for
BigInsights.  We wrote some generic adaptors to collect data from SNMP, JMX,
and REST, and we are currently seeking approval from IBM to contribute them
back to open source.  We use the system to monitor Hadoop and related
technologies, and Chukwa is reliable and works well for us.

Having raw time series metrics and logs makes it possible to correlate
events together.  In that respect, Chukwa's approach is definitely better
than plain Ganglia and Nagios.  With the Nagios and Ganglia combination, you
only get facts after irreversible events have happened, such as the
jobtracker no longer responding or an HBase region server dying.  With raw
data collected and analyzed, we can prevent irreversible events from
happening.  For example, a problematic job can be terminated before it
grows out of control.

Netflix has a number of presentations about how they use Chukwa to
stream data to EC2.  The most recent presentation is here:


On Thu, Nov 29, 2012 at 5:54 AM, Dai, Jason <> wrote:

> Eric and the team,
> First, let me provide a little background about us. We at Intel have been
> using Chukwa for building HiTune (a Hadoop performance analyzer),
> and one of our key team members,
> Jie Huang, was recently accepted as a Chukwa committer (unfortunately she
> has been out sick since late September and has not been as active in the
> Chukwa community as we would like).
> IMO, a key question for the Chukwa project is how to grow the
> community, and I believe an active developer community is driven by active
> users.  It is unclear to me at this moment who is using Chukwa in their
> daily work, what it is being used for, and how it can play an important
> role in its target domain. I would suggest that people on the list share
> their usage as the first step - How are you using Chukwa? Do you think
> Chukwa is a good solution that can attract new users for that specific
> problem?
> As a starter, I'll share our usage:
> 1)      We have been using Chukwa to collect and aggregate performance
> metrics from Hadoop clusters, so that our tool HiTune can analyze the
> performance of Hadoop applications.
> 2)      And as we outlined in CHUKWA-665, we have a prototype that uses
> Chukwa to collect and aggregate cluster system metrics, which powers the
> Ganglia web frontend for cluster monitoring.
> IMHO, at this moment Flume is winning mindshare for distributed data
> collection (e.g., ETL), and Ganglia & Nagios are the cluster monitoring
> tools of choice; I wonder what your takes are on how Chukwa can
> differentiate in these domains, or maybe there are some other domains
> Chukwa is good at.
> Thanks,
> -Jason
> -----Original Message-----
> From: Eric Yang
> Sent: Mon, 26 Nov 2012 03:33:12 GMT
> Subject: What constitute a successful project?
> Hi IPMC,
> For the past two years, Chukwa has been labelled as non-active project by
> mentors, and has been put on votes for retiring this project by mentor and
> In this year's stats, Chukwa has more activity in comparison to Apache
> Wink in both mailing list traffic and resolved jiras.  Yet Chukwa has been
> voted to discontinue by mentors, but Wink is voted to graduate by the same
> mentor. Here are the numbers of mails that showed up on the dev lists of
> Apache Chukwa and Apache Wink:
> ...
