chukwa-user mailing list archives

From Eric Yang <>
Subject Re: What's Chukwa for?
Date Fri, 04 Dec 2009 01:58:21 GMT
Hi MJ,

When the Chukwa Agent was introduced in the design, it was meant to be both a
configuration management agent and a monitoring agent.  However, the current
agent implementation has no configuration management capabilities, although
nothing prevents developers from adding them.  For now, the community is
focused on the monitoring agent.


On 12/3/09 3:42 PM, "MJ Lai" <> wrote:

> Eric, Ari.
> Thanks for your responses.
> I have some extended questions. I'm trying to figure out what is the
> best way to manage a production hadoop cluster, features include:
> - deployment: install hadoop from one console;
> - monitoring
> - configure
> - log/system analytics
> - software upgrade
> - etc.
> It seems Ganglia/Nagios are widely used to monitor hadoop cluster
> metrics, and Chukwa is for log analytics. But still, it is a pain to
> manage/configure a hadoop cluster.
> Since Chukwa has an agent installed in each endpoint, do you have any
> plan to build it as a universal platform for hadoop management?
> Thanks.
> MJ
> Eric Yang wrote:
>> Chukwa is a generic distributed log processing system.  Its primary use
>> case is monitoring Hadoop clusters.  There are several analytics bundled for
>> displaying system state, java vm resource usage, Hadoop dfs, mapreduce
>> metrics.  However, anyone could add their own analytics system to run in
>> Chukwa.
>> In general, a monitoring system is usually independent of the system
>> being monitored.  The Chukwa documentation might suggest that you need two
>> clusters for this to work.  However, it's actually possible for Chukwa to
>> run on the same cluster as it's monitoring.
>> It's better to call Chukwa a reporting system if it is running on the
>> same cluster.  If Hadoop crashed in this type of deployment, Chukwa would
>> not be responsible for failing to alert.
>> Regards,
>> Eric
>> On 12/2/09 3:30 PM, "MJ Lai" <> wrote:
>>> Hi.
>>> It is another ``what for'' question.
>>> I went thru the chukwa web site and am still kind of confused by what the
>>> software is really for. Can I say the major purpose is to provide 1) a
>>> generic distributed log processing system, or 2) this log system is only
>>> for Hadoop cluster? In case of 1), why do we want to make it tightly
>>> bound to Hadoop?
>>> Assume we have a 100-machine cluster (no hadoop), if I deploy Chukwa to
>>> process the cluster logs, I still need to create another hadoop cluster
>>> to make it work?
>>> I think some practical use cases could reduce the confusion about
>>> this project.
>>> Thanks.
>>> MJ
