chukwa-user mailing list archives

From: Eric Yang
Subject: Re: extending demux
Date: Tue, 28 Dec 2010 08:58:07 GMT

Hi Ari,

The Demux framework has been modified to operate in two modes.  First, the
MapReduce mode is fully backward compatible with Chukwa 0.4 demux.
Second, the Chukwa collector uses HBaseWriter, which implements its own
OutputCollector and invokes the demux parsers.  This makes it easy to
write one parser that works in both modes.

Take a look at org.apache.hadoop.chukwa.extraction.demux.processor.mapper.SystemMetrics.
All demux parsers extend the AbstractProcessor class and implement the
parse function.  The inputs to the parse function are the Chukwa
chunk as a string, an output collector, and a reporter.

AbstractProcessor provides a special function called:

buildGenericRecord(ChukwaRecord record, String body, long timestamp,
String reduceType);

ChukwaRecord is basically a HashMap, and records are grouped by reduceType,
timestamp, and primary key (i.e. csource).  In HBase mode,
reduceType maps to the column family name.  Timestamp + primary key is
mapped to the row key in HBase.  The table name is defined by an annotation
at the beginning of the class.  HBaseWriter's OutputCollector takes the
output emitted by the parse function and puts the records into
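
As a rough illustration of the mapping described above, the following
Java sketch shows where each grouping field of a record could end up in
HBase.  The class name RecordCoordinates, the "-" separator, and the
order of timestamp and primary key inside the row key are assumptions
made for illustration only; the actual key encoding used by HBaseWriter
may differ.

```java
/** Hypothetical helper, not part of Chukwa: maps grouping fields to HBase coordinates. */
final class RecordCoordinates {
    final String table;        // taken from the class-level table annotation
    final String columnFamily; // taken from reduceType
    final String rowKey;       // built from timestamp + primary key

    RecordCoordinates(String table, String reduceType, long timestamp, String primaryKey) {
        this.table = table;
        this.columnFamily = reduceType;
        // Assumed encoding: timestamp first, then the primary key (e.g. csource).
        this.rowKey = timestamp + "-" + primaryKey;
    }

    @Override
    public String toString() {
        return table + " / " + rowKey + " / " + columnFamily;
    }

    public static void main(String[] args) {
        RecordCoordinates c =
            new RecordCoordinates("SystemMetrics", "cpu", 1293526687000L, "host1.example.com");
        System.out.println(c);
    }
}
```

The point of the sketch is only that a parser never names a row or
column family directly; it sets reduceType, timestamp, and primary key,
and the writer derives the HBase location from them.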

In summary, to develop a demux processor:

1. Extend AbstractProcessor
2. Annotate the class with the table name
3. Implement the parse function
4. Configure chukwa-demux-conf.xml to map the data type to the new parser
5. Create the HBase schema
6. Restart the collector with the new jar and watch the data flow through and show up in HICC
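
Steps 1 to 3 might look roughly like the following.  To keep the sketch
self-contained, every type in it (Table, SketchRecord, Collector, and
this AbstractProcessor) is a simplified stand-in declared in the snippet
itself, not the real Chukwa or Hadoop class, and the reporter argument is
left out.  The data type name MyMetrics, the column family "site", and
the "key=value" input format are also assumptions.  For the real
signatures, check SystemMetrics and AbstractProcessor in the Chukwa
source.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the table annotation (step 2); hypothetical, not Chukwa's own.
@Retention(RetentionPolicy.RUNTIME)
@interface Table { String name(); String columnFamily(); }

// Stand-in for ChukwaRecord: a map of fields plus the grouping attributes.
class SketchRecord {
    final Map<String, String> fields = new LinkedHashMap<>();
    long timestamp;
    String reduceType;
}

// Stand-in for the output collector handed to parse().
interface Collector { void collect(SketchRecord record); }

// Stand-in for AbstractProcessor (step 1).
abstract class AbstractProcessor {
    protected abstract void parse(String chunkBody, Collector output);

    // Simplified analogue of buildGenericRecord: stamps the grouping attributes.
    protected void buildGenericRecord(SketchRecord record, String body,
                                      long timestamp, String reduceType) {
        record.timestamp = timestamp;
        record.reduceType = reduceType;
        if (body != null) {
            record.fields.put("body", body);
        }
    }
}

@Table(name = "MyMetrics", columnFamily = "site")
class MyMetricsProcessor extends AbstractProcessor {
    // Step 3: turn one chunk of whitespace-separated "key=value" pairs into one record.
    @Override
    protected void parse(String chunkBody, Collector output) {
        SketchRecord record = new SketchRecord();
        buildGenericRecord(record, null, System.currentTimeMillis(), "site");
        for (String pair : chunkBody.trim().split("\\s+")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                record.fields.put(kv[0], kv[1]);
            }
        }
        output.collect(record);
    }

    public static void main(String[] args) {
        List<SketchRecord> out = new ArrayList<>();
        new MyMetricsProcessor().parse("requests=42 latency_ms=17", out::add);
        System.out.println(out.get(0).reduceType + " " + out.get(0).fields);
    }
}
```

Because the parser only talks to the collector interface, the same parse
body can be driven by either mode; which collector it receives is
decided by the caller.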


On Mon, Dec 27, 2010 at 6:12 PM, Ariel Rabkin wrote:
> Howdy.
> I'm gearing up to make use of the new Demux framework. I have several
> site-specific metrics that I want to use Chukwa to collect and graph.
> I'm a little vague about how to do this.  I think I see what the HBase
> metric creation needs to be. But what do I need to do in the way of
> Demux processors?
> What input format does HICC expect / what's the output format supposed
> to be?  Which are the right examples for me to look at? Is anything
> documented yet? Who has done this already?
> -Ari
> --
> Ari Rabkin
> UC Berkeley Computer Science Department
