hadoop-mapreduce-dev mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: YARN, LocalResource, and Classpaths
Date Wed, 21 Sep 2011 04:46:45 GMT
Good tip.

Can you please add it to the FAQ Chris started on http://wiki.apache.org/hadoop/WritingYarnApps?
(Oh, thanks Chris!)

Arun

On Sep 20, 2011, at 7:29 PM, Hitesh Shah wrote:

> Chris, 
> 
> Another small note in case you have not already figured it out. 
> 
> It should likely be possible to debug the application master by running it directly
> (instead of the multi-step client job submission approach). The only thing to note is
> that you will need to comment out the code that does the app master's registration to
> the resource manager, and likely use a different fail count on each run to generate a
> different attempt id.
> 
> -- Hitesh 
> 
> On Sep 20, 2011, at 2:35 PM, Chris Riccomini wrote:
> 
>> Hey Arun,
>> 
>> For sure. Any preferred location?
>> 
>> Cheers,
>> Chris
>> ________________________________________
>> From: Arun C Murthy [acm@hortonworks.com]
>> Sent: Tuesday, September 20, 2011 2:32 PM
>> To: mapreduce-dev@hadoop.apache.org
>> Subject: Re: YARN, LocalResource, and Classpaths
>> 
>> Thanks Chris! We'd love to have you put up a README wiki to log your adventures for
>> posterity... interested?
>> 
>> On Sep 20, 2011, at 2:20 PM, Chris Riccomini wrote:
>> 
>>> Hey Guys,
>>> 
>>> Thanks for the help. For the record, this solved the problem:
>>> 
>>>  new ApplicationMasterExecutor(packagePath)
>>>    .addCommand("java -cp './package/*' kafka.yarn.ApplicationMaster " + streamerClass + " " + tasks + " "
>>>      + "1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout "
>>>      + "2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
>>>    .execute(new Configuration)
>>> 
>>> Java is so sensitive about the classpath. Sigh.
>>> 
>>> Cheers,
>>> Chris
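
A minimal sketch of one detail in the command above, assuming a POSIX shell. The single quotes around ./package/* keep the shell from expanding the wildcard, so the JVM receives it literally and expands it itself to the jars directly inside package (the JVM's class path wildcard is not recursive, so ./* would not reach jars under package/). The directory and jar names below are placeholders, and printf stands in for java so the example runs without a JVM.

```shell
# Placeholder layout: two jars inside a "package" directory.
workdir=$(mktemp -d)
mkdir "$workdir/package"
touch "$workdir/package/a.jar" "$workdir/package/b.jar"
cd "$workdir"

# Quoted: the program receives the wildcard as a single literal argument.
quoted_args=$(printf '[%s]' './package/*')

# Unquoted: the shell expands the glob into one argument per matching file.
unquoted_args=$(printf '[%s]' ./package/*)

echo "quoted:   $quoted_args"
echo "unquoted: $unquoted_args"
```

With java -cp, the unquoted form would hand the JVM only the first jar as the class path and treat the next argument as the main class name, which is why the quoted form is the safer one.
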
>>> ________________________________________
>>> From: Chris Riccomini [criccomini@linkedin.com]
>>> Sent: Tuesday, September 20, 2011 12:59 PM
>>> To: mapreduce-dev@hadoop.apache.org
>>> Subject: RE: YARN, LocalResource, and Classpaths
>>> 
>>> Hey Vinod,
>>> 
>>> Interesting follow up.
>>> 
>>> find ./package/*
>>> 
>>> Shows:
>>> 
>>> package/activation-1.1.jar
>>> package/aopalliance-1.0.jar
>>> package/asm-3.2.jar
>>> package/aspectjrt-1.6.5.jar
>>> package/avro-1.5.3.jar
>>> package/avro-ipc-1.5.3.jar
>>> ...
>>> 
>>> Appears it's unzipped, after all. I think the :./package/*: syntax is busted
>>> for me, fixing.
>>> 
>>> Cheers,
>>> Chris
>>> ________________________________________
>>> From: Chris Riccomini [criccomini@linkedin.com]
>>> Sent: Tuesday, September 20, 2011 12:49 PM
>>> To: mapreduce-dev@hadoop.apache.org
>>> Subject: RE: YARN, LocalResource, and Classpaths
>>> 
>>> Hey Vinod,
>>> 
>>> Yea, I considered that.
>>> 
>>> find .
>>> 
>>> .
>>> ./.task.sh.crc
>>> ./.container_tokens.crc
>>> ./container_tokens
>>> ./task.sh
>>> ./package
>>> 
>>> ls -l package
>>> 
>>> lrwxrwxrwx 1 criccomi eng 125 Sep 20 12:47 package -> /tmp/nm-local-dir/usercache/criccomi/appcache/application_1316468926404_0016/filecache/2573972455544981641/kafka-streamer.tgz
>>> 
>>> Am I supposed to unzip the tgz myself? I saw code in the FSDownload to unpack
>>> it, I thought.
>>> 
>>> Thanks!
>>> Chris
>>> ________________________________________
>>> From: Vinod Kumar Vavilapalli [vinodkv@hortonworks.com]
>>> Sent: Tuesday, September 20, 2011 11:05 AM
>>> To: mapreduce-dev@hadoop.apache.org
>>> Subject: Re: YARN, LocalResource, and Classpaths
>>> 
>>> The .tgz archives are untarred and symlinked to in the working directory. In
>>> your code, you mentioned the key to be "package" in the resource
>>> description. I'd suggest you try to add files under the "package" dir
>>> (actually a symlink). Also, it really depends on how your package is actually
>>> built.
>>> 
>>> For more debugging, I'd replace the java command with a shell command and
>>> list the files recursively under the current work-dir to stdout.
>>> 
>>> HTH,
>>> +Vinod
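
A minimal sketch of the recursive listing suggested above, assuming a POSIX shell. The layout is a hypothetical stand-in for a container working directory (a package symlink pointing at an unpacked archive), not output from a real NodeManager. Because package is a symlink, a plain find . stops at the link itself, while find -L . follows it and shows the unpacked jars.

```shell
# Hypothetical stand-in for a container working directory: the archive is
# unpacked elsewhere and exposed through a symlink named "package".
workdir=$(mktemp -d)
cache=$(mktemp -d)
touch "$cache/activation-1.1.jar" "$cache/asm-3.2.jar"
ln -s "$cache" "$workdir/package"
cd "$workdir"

# Plain find does not follow symlinks, so only the link itself is listed.
plain_listing=$(find . | sort)

# find -L follows symlinks and lists the unpacked contents as well.
followed_listing=$(find -L . | sort)

echo "$plain_listing"
echo "---"
echo "$followed_listing"
```

In a real launch command the same find -L . would replace the java invocation and keep the stdout and stderr redirections, so the listing ends up in the container's log directory.
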
>>> 
>>> 
>>> On Tue, Sep 20, 2011 at 10:56 PM, Chris Riccomini
>>> <criccomini@linkedin.com> wrote:
>>> 
>>>> Hey All,
>>>> 
>>>> I'm trying to use YARN's LocalResource stuff to ship a tgz file that has
>>>> all the jars that I need on my Classpath (for my ApplicationMaster and my
>>>> container tasks). I'm having trouble figuring out what my client should put
>>>> in the ApplicationMaster exec command that it sends to the Resource manager.
>>>> I tried -cp :./*:, but it didn't seem to pick anything up.
>>>> 
>>>> Here are some code snippets:
>>>> 
>>>> // tell resource manager to execute kafka's application master
>>>> new ApplicationMasterExecutor(packagePath)
>>>>   .addCommand("java -cp :./*: kafka.yarn.ApplicationMaster " +
>>>> streamerClass + " " + tasks + " "
>>>>     + "1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout "
>>>>     + "2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
>>>>   .execute(new Configuration)
>>>> 
>>>> When I run that, I see in my log output dir that it could not find
>>>> kafka.yarn.ApplicationMaster. In the code sample above, packagePath is a
>>>> local FS path to my .tgz file. It's uploaded by my ApplicationMasterExecutor
>>>> with:
>>>> 
>>>> val packageFile = new File(packagePath);
>>>> val packageUrl =
>>>> ConverterUtils.getYarnUrlFromPath(FileContext.getFileContext.makeQualified(new
>>>> Path(packagePath)));
>>>> 
>>>> packageResource.setResource(packageUrl);
>>>> packageResource.setSize(packageFile.length());
>>>> packageResource.setTimestamp(packageFile.lastModified());
>>>> packageResource.setType(LocalResourceType.ARCHIVE);
>>>> packageResource.setVisibility(LocalResourceVisibility.APPLICATION);
>>>> 
>>>> resource.setMemory(memory)
>>>> containerCtx.setResource(resource)
>>>> containerCtx.setCommands(cmds.toList)
>>>> containerCtx.setLocalResources(Collections.singletonMap("package",
>>>> packageResource))
>>>> appCtx.setApplicationId(appId)
>>>> appCtx.setUser(user.getShortUserName)
>>>> appCtx.setAMContainerSpec(containerCtx)
>>>> request.setApplicationSubmissionContext(appCtx)
>>>> applicationsManager.submitApplication(request)
>>>> 
>>>> When I poke around in /tmp/nm-local-dir/usercache/criccomi, I see nothing
>>>> in there after a run. Before, when I was running with visibility set to USER
>>>> (instead of APPLICATION), I could see the unzipped data in the filecache,
>>>> but that's gone now that I switched to APPLICATION.
>>>> 
>>>> I'm probably just not putting the proper classpath. How do I do this?
>>>> 
>>>> Cheers,
>>>> Chris
>>>> 
>> 
> 

