chukwa-user mailing list archives

From Eric Yang <>
Subject Re: Pig scripts for Chukwa 0.5
Date Wed, 06 Jul 2011 15:42:55 GMT
Hi Preetam,

ClusterSummary.pig is the only script that works with HBase.  The other
Pig scripts were designed to work on sequence files for Chukwa 0.4.
They were thrown together at crunch time, and there is no
documentation.  The ChukwaLoader/ChukwaStore functions need to be
revised to use HBase to bring the scripts up to date with Chukwa 0.5.
The Hadoop_*.pig scripts downsample data from raw resolution to a
specified time resolution, e.g. 30-minute or 180-minute averages.
UserDailySummary.pig is designed to aggregate data from JobHistory log
files to generate a user usage report.  However, it was designed to
work on the JobHistory files of Hadoop 0.18.  I don't think it works
with Hadoop 0.20+ because the JobHistory format changed in Hadoop 0.20.
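
To illustrate what the downsampling amounts to, here is a rough sketch
in Python rather than Pig.  It is not taken from the Hadoop_*.pig
scripts; the function name, the (timestamp, value) sample layout, and
the sample data are assumptions made up for the example.  It groups raw
samples into fixed-width time buckets and averages each bucket.

```python
from collections import defaultdict


def downsample(samples, resolution_minutes):
    """Average (timestamp_seconds, value) samples into fixed-width buckets.

    Hypothetical helper for illustration only; not part of Chukwa.
    """
    width = resolution_minutes * 60
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in samples:
        bucket = ts - ts % width  # start time of the bucket containing ts
        sums[bucket] += value
        counts[bucket] += 1
    # Map each bucket start time to the mean of the values that fell in it.
    return {bucket: sums[bucket] / counts[bucket] for bucket in sorted(sums)}


# Made-up raw samples: (seconds since start, metric value).
raw = [(0, 10.0), (600, 20.0), (1799, 30.0), (1800, 40.0), (3000, 60.0)]
averaged = downsample(raw, 30)
```

Calling downsample(raw, 180) on the same data would collapse all five
samples into a single 180-minute average.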


On Wed, Jul 6, 2011 at 5:04 AM, Preetam Patil <> wrote:
> Hi,
> I notice that there are a bunch of Pig scripts in scripts/pig directory,
> only ClusterSummary.pig seems to be mentioned in the documentation. The
> other scripts also seem to be based on a storage model other than HBase,
> but provide more info (e.g., per-job/task stats) than that stored in
> HBase.
> Are these compatible with 0.5, and if not, what needs to be done to get
> them working and where can I find any API info for them?
> Thanks,
> -preetam
