chukwa-dev mailing list archives

From Bill Graham <>
Subject Re: Improvement for Chukwa Agent and Collector
Date Mon, 09 Aug 2010 18:33:08 GMT
I agree that we should implement the features you suggest. I've been
thinking about a REST API for the agents lately, as I'd also like to be able
to expose statistics to help with monitoring. Something similar to what the
collector does so you can attach monitoring to a URL to see if the average data
rate suddenly drops.

Regarding the proposed API protocol, I think we should use POST, GET and
DELETE to create, fetch and remove adaptors, similar to how you propose, but
the identifier in the rest resource should be the adaptor id, not the
filename. This is more RESTful since the adaptor is the thing being
accessed, not the file. Also, you could have more than one adaptor on a
given file and some adaptors (e.g., JMSAdaptor) don't have a file associated
with them.
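
A minimal sketch of why the adaptor id works better as the resource key than the filename. Everything here is hypothetical and for illustration only: the `Adaptor` and `AdaptorRegistry` names, the fields, and the uuid-based id are assumptions, not Chukwa's actual classes or id scheme.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Adaptor:
    """Hypothetical adaptor record; not Chukwa's real class."""
    adaptor_class: str
    source_file: Optional[str]  # None for adaptors that do not tail a file


class AdaptorRegistry:
    """Hypothetical in-memory registry keyed by adaptor id, not by filename."""

    def __init__(self) -> None:
        self._adaptors: Dict[str, Adaptor] = {}

    def add(self, adaptor: Adaptor) -> str:
        # Assumption: a random 32-character hex id stands in for the real id.
        adaptor_id = uuid.uuid4().hex
        self._adaptors[adaptor_id] = adaptor
        return adaptor_id

    def get(self, adaptor_id: str) -> Optional[Adaptor]:
        return self._adaptors.get(adaptor_id)

    def remove(self, adaptor_id: str) -> bool:
        # True if an adaptor with that id existed and was removed.
        return self._adaptors.pop(adaptor_id, None) is not None

    def __len__(self) -> int:
        return len(self._adaptors)
```

Keyed this way, two adaptors can tail the same file and a file-less adaptor such as JMSAdaptor still gets a unique address. A filename key would collide in the first case and not exist in the second.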

I propose something like this:

- Add Adaptor:

POST /rest/v1.0/adaptor HTTP/1.0
Accept: text/plain
Content-Type: application/json
{ "RecordType" : "jvm", "Cluster": "demo", adaptor configs including offset,
other tags ... }

Returns: adaptor metadata including id

- Get Adaptor fcb0fe44e9dd6d2283962cb0e3b4ea0f:

GET /rest/v1.0/adaptor/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0

- Remove Adaptor fcb0fe44e9dd6d2283962cb0e3b4ea0f:

DELETE /rest/v1.0/adaptor/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0

- List all adaptors:
GET /rest/v1.0/adaptor HTTP/1.0

- Help
GET /rest/v1.0/help HTTP/1.0

- Statistics for all adaptors
GET /rest/v1.0/adaptorStats HTTP/1.0

- Statistics for a single adaptor
GET /rest/v1.0/adaptorStats/fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
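
The endpoints above could be dispatched with a small route table. This is only a sketch: the action names (`add_adaptor`, `list_adaptors`, and so on) are hypothetical labels, and the 32-character lowercase hex pattern for the adaptor id is an assumption inferred from the example id, not a documented format.

```python
import re
from typing import Optional, Tuple

# Assumption: adaptor ids look like the example, 32 lowercase hex characters.
ADAPTOR_ID = r"(?P<id>[0-9a-f]{32})"
BASE = r"/rest/v1\.0"

# (HTTP method, path pattern, hypothetical action name)
ROUTES = [
    ("POST", BASE + r"/adaptor", "add_adaptor"),
    ("GET", BASE + r"/adaptor", "list_adaptors"),
    ("GET", BASE + r"/adaptor/" + ADAPTOR_ID, "get_adaptor"),
    ("DELETE", BASE + r"/adaptor/" + ADAPTOR_ID, "remove_adaptor"),
    ("GET", BASE + r"/help", "help"),
    ("GET", BASE + r"/adaptorStats", "all_adaptor_stats"),
    ("GET", BASE + r"/adaptorStats/" + ADAPTOR_ID, "adaptor_stats"),
]


def route(method: str, path: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (action, adaptor_id or None), or None if nothing matches."""
    for route_method, pattern, action in ROUTES:
        if route_method != method:
            continue
        match = re.fullmatch(pattern, path)
        if match:
            # Patterns without an id group yield an empty groupdict.
            return action, match.groupdict().get("id")
    return None
```

For example, `route("DELETE", "/rest/v1.0/adaptor/fcb0fe44e9dd6d2283962cb0e3b4ea0f")` resolves to the remove action together with the id, while an unsupported method or unknown path returns `None`.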



On Mon, Aug 9, 2010 at 10:01 AM, Eric Yang <> wrote:

> Hi all,
>  Chukwa Agent has a custom command protocol (port 9093).  The current
> protocol is not easy to modify to implement security related features such
> as authentication and authorization.  I would like to propose that we use
> web service REST like protocol to improve security and be more aligned with
> web standards.  Let's go through the use cases of Chukwa Agent command
> protocol:
> Start an adaptor:
> Current command: Add
> org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8NewLineEscaped
> /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log 0
> Proposed:
> POST /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
> Accept: chukwa/UTF8NewLineEscaped (optional)
> Offset: 0 (optional)
> Content-Type: application/json
> { "RecordType" : "jvm", "Cluster": "demo", other tags ... }
> List adaptors:
> Current command: List
> Proposed:
> GET / HTTP/1.0
> Accept: text/html
> Get list of information about all streaming adaptors
> HEAD /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log HTTP/1.0
> or
> HEAD /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
> Get information about the streaming adaptor only.
> Stop adaptors:
> Current command: Stop adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f
> Proposed:
> DELETE /tmp/chukwa/var/log/metrics/chukwa-hdfs-jvm-1271121726962.log
> HTTP/1.0 or
> DELETE /adaptor_fcb0fe44e9dd6d2283962cb0e3b4ea0f HTTP/1.0
> Delete the adaptor
> Help:
> Current command: Help
> Proposed:
> GET /help HTTP/1.0
> Accept: text/html
> With this modification, we can support encryption and Basic/Digest
> Authentication from existing libraries without reinventing the wheel.  If the
> community is ok with this change, I would like to propose the next
> improvement:
> Chukwa Agent and collectors are two different feature sets, but there
> shouldn't be any roadblock to building a switch to toggle the machine to
> serve different responsibilities.  For example, a Chukwa agent machine can
> flip a switch to join the collector pool and continue to stream data from
> itself.  With this improvement, it is easier to dynamically create a bigger
> data collection pipeline on the fly.  Both systems use the same
> communication protocol, hence it is easier to manage.  In the future, we can
> add additional commands like TRACE /config/reload to reload configuration,
> and tap into ZooKeeper for managing data flow in centralized configuration
> management.
> Any thoughts?
> Regards,
> Eric
