chukwa-dev mailing list archives

From "Guille -bisho- (JIRA)" <>
Subject [jira] Commented: (CHUKWA-462) Store the cluster in the key for performance and easier customization on mappers
Date Thu, 11 Mar 2010 16:47:27 GMT


Guille -bisho- commented on CHUKWA-462:

There is still one test failing:
Testcase: testFSMBuilder_JobHistory020(org.apache.hadoop.chukwa.analysis.salsa.fsm.TestFSMBuilder):
Error running FSMBuilder: Job failed!
junit.framework.AssertionFailedError: Error running FSMBuilder: Job failed!
at org.apache.hadoop.chukwa.analysis.salsa.fsm.TestFSMBuilder.testFSMBuilder_JobHistory020(

I don't know why, because the cluster is extracted correctly. I will continue with this on
Tuesday, as I'm traveling. If anyone knows what could be happening here, please tell me.

> Store the cluster in the key for performance and easier customization on mappers
> --------------------------------------------------------------------------------
>                 Key: CHUKWA-462
>                 URL:
>             Project: Hadoop Chukwa
>          Issue Type: Improvement
>          Components: Data Processors
>            Reporter: Guille -bisho-
>         Attachments: cluster_in_ChukwaRecordKey.v3.diff, cluster_in_ChukwaRecordKey.v4.diff
> Right now the Chukwa framework stores the destination cluster as a tag in the Chunk.
> The tags are then copied to the ChukwaRecord, and before each record is stored, the
> cluster is parsed out of its tags with a regular expression.
> - It's slow to apply a regular expression to each record.
> - It's harder to modify the destination cluster from the mapper; you have to tweak the
>   tags field.
> - Storing the cluster on every record takes unneeded space.
> The proposed patch:
> - Extracts the cluster from the chunk tags just once per chunk, which is much faster.
> - Stores the cluster in the key, so it's easy to recover.
> - Makes it easy to tweak from the mapper: just alter it with key.setClusterName(String clusterName).
> - Strips the cluster from the tags field of the resulting Chukwa records. If the tags
>   field is then empty, skips setting the tags field in the record entirely.
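
For readers following the description above, here is a minimal sketch of the idea: parse the cluster out of the tags once, keep it on the key, and strip it from the tags that go into each record. It is not the attached patch. The class ClusterKeySketch, the stand-in SketchKey, and the helper names are hypothetical, and the tag format cluster="name" is an assumption. Only setClusterName(String) is taken from the description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClusterKeySketch {
    // Assumption: the cluster appears in the tags string as cluster="name".
    private static final Pattern CLUSTER_TAG =
            Pattern.compile("\\s*\\bcluster=\"([^\"]*)\"\\s*");

    /** Hypothetical stand-in for the record key; not the real ChukwaRecordKey. */
    static final class SketchKey {
        private String clusterName;

        void setClusterName(String clusterName) { this.clusterName = clusterName; }

        String getClusterName() { return clusterName; }
    }

    /** Runs the regular expression once per chunk; returns null if no cluster tag is found. */
    static String extractCluster(String chunkTags) {
        if (chunkTags == null) {
            return null;
        }
        Matcher m = CLUSTER_TAG.matcher(chunkTags);
        return m.find() ? m.group(1) : null;
    }

    /** Removes the cluster tag; an empty result means the record's tags field can be skipped. */
    static String stripCluster(String chunkTags) {
        if (chunkTags == null) {
            return "";
        }
        return CLUSTER_TAG.matcher(chunkTags).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String chunkTags = "cluster=\"demo\" rack=\"r1\"";

        // Done once per chunk, then reused for every record built from that chunk.
        String cluster = extractCluster(chunkTags);
        String recordTags = stripCluster(chunkTags);

        SketchKey key = new SketchKey();
        key.setClusterName(cluster);

        System.out.println("cluster on key: " + key.getClusterName());
        System.out.println("tags on record: " + recordTags);
    }
}
```

A mapper that wants to redirect records to a different cluster would then only call setClusterName on the key, without touching the tags string.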

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
