chukwa-dev mailing list archives

From "Eric Yang (JIRA)" <>
Subject [jira] Commented: (CHUKWA-488) Hadoop cannot find custom Demux class
Date Thu, 20 May 2010 19:11:16 GMT


Eric Yang commented on CHUKWA-488:

+1 Looks good, and works on my test environment.

> Hadoop cannot find custom Demux class
> -------------------------------------
>                 Key: CHUKWA-488
>                 URL:
>             Project: Hadoop Chukwa
>          Issue Type: Bug
>          Components: MR Data Processors
>    Affects Versions: 0.4.0
>         Environment: Linux x86-64, Java 1.6.0_20
>            Reporter: Kirk True
>         Attachments: Demux.diff
> I'm getting ClassNotFoundException errors when running inside Hadoop's map phase: Hadoop
> is unable to find my class org.apache.hadoop.chukwa.extraction.demux.processor.mapper.XmlBasedDemux,
> which I've packaged in a JAR named data-collection-demux-0.1.jar.
> The problem seems to be in the values of these two properties in the Hadoop job configuration:
> {code}
> <property>
>     <name>mapred.job.classpath.files</name>
>     <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
> </property>
> <property>
>     <name>mapred.cache.files</name>
>     <value>hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar</value>
> </property>
> {code}
> The problem seems to stem from the fact that the call to DistributedCache.addFileToClassPath
> is passing in a Path that is in URI form, i.e. hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar,
> whereas the DistributedCache API expects it to be a filesystem-based path (i.e. /chukwa/demux/data-collection-demux-0.1.jar).
> I'm not sure why, but the FileStatus object returned by FileSystem.listStatus is returning
> a URL-based path instead of a filesystem-based path.
> I kludged the Demux class's addParsers method to strip the "hdfs://localhost:9000" portion of
> the string and now my class is found. I will attempt to provide a patch today that determines
> the value of Hadoop's and strips that from the value returned in
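
The difference between the URI form and the filesystem-based form described in the quoted report can be illustrated with a minimal sketch. It uses only java.net.URI from the JDK so that it runs without Hadoop on the classpath; the class and method names are hypothetical, and this is not the fix in the attached Demux.diff. In Hadoop code, the comparable call on an org.apache.hadoop.fs.Path would be path.toUri().getPath().

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Chukwa or Hadoop.
final class ClasspathPathSketch {

    // Returns only the path component of a fully qualified URI string,
    // dropping the scheme ("hdfs") and the authority ("localhost:9000").
    // A string that is already a plain absolute path is returned unchanged.
    static String stripSchemeAndAuthority(String qualified) {
        return URI.create(qualified).getPath();
    }

    public static void main(String[] args) {
        // The value from the job configuration quoted in the report.
        String qualified =
            "hdfs://localhost:9000/chukwa/demux/data-collection-demux-0.1.jar";
        System.out.println(stripSchemeAndAuthority(qualified));
    }
}
```

Taking the path component from the parsed URI avoids hard-coding the "hdfs://localhost:9000" prefix, which would differ from one cluster to another.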

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
