Ok...I think I figured it out.  This email thread made me take a look at how I'm kicking off my Hadoop job.  My Hadoop driver, the class that links a bunch of jobs together into a workflow, runs on a different machine than the cluster Hadoop is running on.  That means when I create a new Configuration() object, it tries to load the Hadoop config from the classpath, but since the driver isn't running on the Hadoop cluster and doesn't have access to the cluster's configuration files, it just falls back to the built-in default values...config for suck.
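
For reference, here's a minimal sketch of what I mean (assuming the 0.20-era mapred API; the class name is made up, the config keys are just the stock ones): a bare JobConf only sees the *-default.xml files plus whatever *-site.xml files happen to be on the classpath of the JVM it runs in.

import org.apache.hadoop.mapred.JobConf;

public class ConfigCheck {
    public static void main(String[] args) {
        // Loads *-default.xml plus any *-site.xml found on this JVM's classpath.
        JobConf conf = new JobConf();

        // With no site files on the driver's classpath these print the
        // standalone defaults ("file:///" and "local"), not the Whirr cluster.
        System.out.println("fs.default.name    = " + conf.get("fs.default.name"));
        System.out.println("mapred.job.tracker = " + conf.get("mapred.job.tracker"));
    }
}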

So I copied the *-site.xml files from my namenode over to the machine my Hadoop job driver runs on and put them on its classpath, and shazam...it picked up the Hadoop config that Whirr created for me.  Yay!
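
The other option, if you'd rather not touch the classpath, is to point the config at the copied files explicitly. Just a sketch, and the paths below are placeholders for wherever you drop the files on the driver box:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class ClusterConf {
    public static JobConf load() {
        JobConf conf = new JobConf();
        // addResource(Path) reads from the local filesystem of the driver machine.
        conf.addResource(new Path("/home/john/whirr-conf/core-site.xml"));
        conf.addResource(new Path("/home/john/whirr-conf/hdfs-site.xml"));
        conf.addResource(new Path("/home/john/whirr-conf/mapred-site.xml"));
        return conf;
    }
}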



On Wed, Oct 5, 2011 at 10:49 AM, Andrei Savu <savu.andrei@gmail.com> wrote:

On Wed, Oct 5, 2011 at 8:41 PM, John Conwell <john@iamjohn.me> wrote:
It looks like hadoop is reading default configuration values from somewhere and using them, and not reading from the /usr/lib/hadoop/conf/*-site.xml files.

If you are running CDH, the config files are in:

HADOOP=hadoop-${HADOOP_VERSION:-0.20}
HADOOP_CONF_DIR=/etc/$HADOOP/conf.dist

--

Thanks,
John C