chukwa-user mailing list archives

From Ariel Rabkin
Subject Re: Agent and collector
Date Fri, 29 Jul 2011 22:54:46 GMT
Yes, the agent-->collector path is HTTP.

This was done precisely to allow load balancers. I don't know how
well tested that configuration is, though. I think most sites had Chukwa
itself do the load balancing by specifying multiple collectors.
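
For illustration only, here is a minimal Python sketch of the agent-side
behaviour described in the question below: pick one collector at random and
stay with it until a send fails. This is not Chukwa's actual code. The class
and function names, the example URLs, and the `post` callback (a stand-in for
the HTTP POST) are all hypothetical.

```python
import random


class CollectorSelector:
    """Pick one collector at random and stick with it until a send fails."""

    def __init__(self, collectors, rng=None):
        if not collectors:
            raise ValueError("need at least one collector URL")
        self._collectors = list(collectors)
        self._rng = rng or random.Random()
        self._current = self._rng.choice(self._collectors)

    @property
    def current(self):
        return self._current

    def report_failure(self):
        """Switch to a different collector, chosen at random, if one exists."""
        others = [c for c in self._collectors if c != self._current]
        if others:
            self._current = self._rng.choice(others)
        return self._current


def send_with_failover(selector, post, chunk, max_attempts=3):
    """Try to deliver a chunk, moving to another collector after each failure.

    `post(url, chunk)` is a caller-supplied function (hypothetical) that
    returns True on success. Returns the URL that accepted the chunk, or
    None if every attempt failed.
    """
    for _ in range(max_attempts):
        url = selector.current
        if post(url, chunk):
            return url
        selector.report_failure()
    return None


if __name__ == "__main__":
    # Hypothetical collector URLs; a real deployment would list its own.
    selector = CollectorSelector([
        "http://collector1.example:8080/",
        "http://collector2.example:8080/",
    ])
    accepted_by = send_with_failover(selector, lambda url, chunk: True, b"chunk")
    print("delivered to", accepted_by)
```

With a single load-balancer URL in the list, `report_failure` has nothing to
switch to, so every retry goes back to the same address and failover is left
entirely to the balancer.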

There is a notion of end-to-end reliability: the so-called
asynchronous ack mechanism. It's off by default and hasn't been tried
much in production. See for
the detailed design of it.


On Fri, Jul 29, 2011 at 11:04 AM, T. A. Smooth wrote:
> Hello I am checking out Chukwa. I have a few questions I was hoping the mail
> list could answer :-)
> 1) Do Chukwa agents communicate with collectors over HTTP, or some other
> protocol?
> The agent configuration makes me believe that:
> 2) From the docs it seems an agent will pick a collector at random and then
> use that collector until there is a problem in communicating with it. How do
> you think the agent/collector would act if they have a load balancer between
> them? For example, the agent configuration would have just one url
> http://collector-loadbalancer.
> The load balancer would have 1 or more collectors behind it saving the
> chunks it receives to disk or hadoop.
> 3) Does chukwa have any “end-to-end” reliability features for message
> delivery? For example, a collector may receive the chunk from the agent but
> it may have a problem writing it to the data store (e.g. disk space full,
> connection to Hadoop down). Will the agent be notified that the chunk was
> not processed for a certain reason and the agent is told to cache to disk
> the missed message?
> Thanks for the info!
> -tp-

Ari Rabkin
UC Berkeley Computer Science Department
