chukwa-dev mailing list archives

From Eric Yang
Subject Improvement for Chukwa Agent and Collector
Date Mon, 09 Aug 2010 17:01:48 GMT
Hi all,

Chukwa Agent has a custom command protocol (port 9093).  The current
protocol is not easy to modify to implement security-related features such
as authentication and authorization.  I would like to propose that we use
a REST-like web service protocol to improve security and be more aligned
with web standards.  Let's go through the use cases of Chukwa Agent commands:

Start an adaptor:

Current command: Add
/tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log 0

POST /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
Accept: chukwa/UTF8NewLineEscaped (optional)
Offset: 0 (optional)
Content-Type: application/json
{ "RecordType" : "jvm", "Cluster": "demo", other tags ... }
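
As a rough illustration of the request above, here is a minimal Python sketch that builds (but does not send) the proposed POST using only the standard library. The base URL, the constant AGENT_URL and the function name build_add_adaptor_request are assumptions made for the example; the proposal does not say where a REST-enabled agent would listen.

```python
import json
import urllib.request

# Assumption: address of a REST-enabled agent. The proposal does not name one.
AGENT_URL = "http://localhost:9090"


def build_add_adaptor_request(log_path, tags, offset=0, base_url=AGENT_URL):
    """Build (without sending) the proposed POST that starts an adaptor."""
    # The tags (RecordType, Cluster, ...) travel as a JSON body.
    body = json.dumps(tags).encode("utf-8")
    headers = {
        "Accept": "chukwa/UTF8NewLineEscaped",  # optional in the proposal
        "Offset": str(offset),                  # optional in the proposal
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        base_url + log_path, data=body, headers=headers, method="POST"
    )


req = build_add_adaptor_request(
    "/tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log",
    {"RecordType": "jvm", "Cluster": "demo"},
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it, provided an agent implementing this proposal is actually listening at that address.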

List adaptors:

Current command: List

GET / HTTP/1.0
Accept: text/html
Get a list of information about all streaming adaptors.

HEAD /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
HEAD /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
Get information about the streaming adaptor only.

Stop adaptors:

Current command: Stop adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f

DELETE /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
or
DELETE /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
Delete the adaptor
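
The list, inspect and stop use cases above address an adaptor either by its file path or by its adaptor id. A minimal server-side sketch of that lookup, using an in-memory registry and no real HTTP stack, might look like the following. The class AdaptorRegistry and its methods are hypothetical names for illustration, not existing Chukwa code.

```python
class AdaptorRegistry:
    """Hypothetical in-memory registry: adaptor id -> info about the stream."""

    def __init__(self):
        self._by_id = {}

    def add(self, adaptor_id, log_path, offset=0):
        self._by_id[adaptor_id] = {
            "id": adaptor_id, "path": log_path, "offset": offset,
        }

    def resolve(self, url_path):
        """Find an adaptor by '/adaptor_<id>' or by the file path it streams."""
        candidate = url_path.lstrip("/")
        if candidate in self._by_id:
            return self._by_id[candidate]
        for info in self._by_id.values():
            if info["path"] == url_path:
                return info
        return None

    def handle(self, method, url_path):
        """Return (status, payload) for the GET /, HEAD and DELETE use cases."""
        if method == "GET" and url_path == "/":
            return 200, sorted(self._by_id.values(), key=lambda a: a["id"])
        info = self.resolve(url_path)
        if info is None:
            return 404, None
        if method == "HEAD":
            # A real HTTP HEAD response carries no body, so this info would
            # have to be returned in response headers.
            return 200, info
        if method == "DELETE":
            del self._by_id[info["id"]]
            return 200, info
        return 405, None


registry = AdaptorRegistry()
registry.add(
    "adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f",
    "/tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log",
)
print(registry.handle("HEAD", "/adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f"))
```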

Current command: Help

GET /help HTTP/1.0
Accept: text/html
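
To summarize the mapping from the current commands to the proposed requests, here is a small Python sketch that translates a legacy command line into a (method, path, headers) triple. The function name to_rest is hypothetical, and it handles only the simplified command forms quoted in this message; the real Add command may carry more fields.

```python
def to_rest(command_line):
    """Translate one legacy agent command into (method, path, headers)."""
    parts = command_line.split()
    if not parts:
        raise ValueError("empty command")
    verb, args = parts[0].lower(), parts[1:]
    if verb == "add" and len(args) == 2:
        # "Add <file path> <offset>" as quoted in this message.
        path, offset = args
        return "POST", path, {"Offset": offset}
    if verb == "list" and not args:
        return "GET", "/", {"Accept": "text/html"}
    if verb == "stop" and len(args) == 1:
        # "Stop <adaptor id>" becomes a DELETE on that adaptor's resource.
        return "DELETE", "/" + args[0], {}
    if verb == "help" and not args:
        return "GET", "/help", {"Accept": "text/html"}
    raise ValueError(f"unsupported command: {command_line!r}")


for line in ("List", "Stop adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f", "Help"):
    print(line, "->", to_rest(line))
```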

With this modification, we can support encryption and Basic/Digest
Authentication using existing libraries without reinventing the wheel.  If
the community is OK with this change, I would like to propose the next

Chukwa Agent and collectors are two different feature sets, but there
shouldn't be any roadblock to building a switch that toggles a machine
between the two responsibilities.  For example, a Chukwa agent machine can
flip a switch to join the collector pool and continue to stream data from
itself.  With this improvement, it becomes easier to dynamically create a
bigger data collection pipeline on the fly.  Both systems would use the same
communication protocol, hence they are easier to manage.  In the future, we
can add additional commands like TRACE /config/reload to reload
configuration, and tap into ZooKeeper for managing data flow in centralized
configuration management.

Any thoughts?

