hadoop-mapreduce-dev mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: YARN, LocalResource, and Classpaths
Date Tue, 20 Sep 2011 21:56:10 GMT
Feel free to use http://wiki.apache.org/hadoop/WritingYarnApps.

thanks,
Arun

On Sep 20, 2011, at 2:35 PM, Chris Riccomini wrote:

> Hey Arun,
> 
> For sure. Any preferred location?
> 
> Cheers,
> Chris
> ________________________________________
> From: Arun C Murthy [acm@hortonworks.com]
> Sent: Tuesday, September 20, 2011 2:32 PM
> To: mapreduce-dev@hadoop.apache.org
> Subject: Re: YARN, LocalResource, and Classpaths
> 
> Thanks Chris! We'd love to have you put up a README wiki to log your adventures for posterity... interested?
> 
> On Sep 20, 2011, at 2:20 PM, Chris Riccomini wrote:
> 
>> Hey Guys,
>> 
>> Thanks for the help. For the record, this solved the problem:
>> 
>>   new ApplicationMasterExecutor(packagePath)
>>     .addCommand("java -cp './package/*' kafka.yarn.ApplicationMaster " + streamerClass + " " + tasks + " "
>>       + "1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout "
>>       + "2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
>>     .execute(new Configuration)
>> 
>> Java is so sensitive about the classpath. Sigh.
>> 
>> Cheers,
>> Chris
>> ________________________________________
>> From: Chris Riccomini [criccomini@linkedin.com]
>> Sent: Tuesday, September 20, 2011 12:59 PM
>> To: mapreduce-dev@hadoop.apache.org
>> Subject: RE: YARN, LocalResource, and Classpaths
>> 
>> Hey Vinod,
>> 
>> Interesting follow up.
>> 
>> find ./package/*
>> 
>> Shows:
>> 
>> package/activation-1.1.jar
>> package/aopalliance-1.0.jar
>> package/asm-3.2.jar
>> package/aspectjrt-1.6.5.jar
>> package/avro-1.5.3.jar
>> package/avro-ipc-1.5.3.jar
>> ...
>> 
>> Appears it's unzipped, after all. I think the :./package/*: syntax is busted for me, fixing.
>> 
>> Cheers,
>> Chris
>> ________________________________________
>> From: Chris Riccomini [criccomini@linkedin.com]
>> Sent: Tuesday, September 20, 2011 12:49 PM
>> To: mapreduce-dev@hadoop.apache.org
>> Subject: RE: YARN, LocalResource, and Classpaths
>> 
>> Hey Vinod,
>> 
>> Yea, I considered that.
>> 
>> find .
>> 
>> .
>> ./.task.sh.crc
>> ./.container_tokens.crc
>> ./container_tokens
>> ./task.sh
>> ./package
>> 
>> ls -l package
>> 
>> lrwxrwxrwx 1 criccomi eng 125 Sep 20 12:47 package -> /tmp/nm-local-dir/usercache/criccomi/appcache/application_1316468926404_0016/filecache/2573972455544981641/kafka-streamer.tgz
>> 
>> Am I supposed to unzip the tgz myself? I saw code in the FSDownload to unpack it, I thought.
>> 
>> Thanks!
>> Chris
>> ________________________________________
>> From: Vinod Kumar Vavilapalli [vinodkv@hortonworks.com]
>> Sent: Tuesday, September 20, 2011 11:05 AM
>> To: mapreduce-dev@hadoop.apache.org
>> Subject: Re: YARN, LocalResource, and Classpaths
>> 
>> The .tgz archives are untarred and symlinked to in the working directory. In
>> your code, you mentioned the key to be "package" in the resource
>> description. I'd suggest you try to add files under the "package" dir
>> (actually a symlink). Also, it really depends on how your package is
>> actually built.
>> 
>> For more debugging, I'd replace the java command with a shell command and
>> list the files recursively under the current work-dir to stdout.
>> 
>> HTH,
>> +Vinod
>> 
>> 
>> On Tue, Sep 20, 2011 at 10:56 PM, Chris Riccomini <criccomini@linkedin.com> wrote:
>> 
>>> Hey All,
>>> 
>>> I'm trying to use YARN's LocalResource stuff to ship a tgz file that has
>>> all the jars that I need on my Classpath (for my ApplicationMaster and my
>>> container tasks). I'm having trouble figuring out what my client should put
>>> in the ApplicationMaster exec command that it sends to the Resource manager.
>>> I tried -cp :./*:, but it didn't seem to pick anything up.
>>> 
>>> Here are some code snippets:
>>> 
>>>  // tell resource manager to execute kafka's application master
>>>  new ApplicationMasterExecutor(packagePath)
>>>    .addCommand("java -cp :./*: kafka.yarn.ApplicationMaster " + streamerClass + " " + tasks + " "
>>>      + "1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout "
>>>      + "2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
>>>    .execute(new Configuration)
>>> 
>>> When I run that, I see in my log output dir that it could not find
>>> kafka.yarn.ApplicationMaster. In the code sample above, packagePath is a
>>> local FS path to my .tgz file. It's uploaded by my ApplicationMasterExecutor
>>> with:
>>> 
>>>  val packageFile = new File(packagePath);
>>>  val packageUrl = ConverterUtils.getYarnUrlFromPath(
>>>    FileContext.getFileContext.makeQualified(new Path(packagePath)));
>>> 
>>>  packageResource.setResource(packageUrl);
>>>  packageResource.setSize(packageFile.length());
>>>  packageResource.setTimestamp(packageFile.lastModified());
>>>  packageResource.setType(LocalResourceType.ARCHIVE);
>>>  packageResource.setVisibility(LocalResourceVisibility.APPLICATION);
>>> 
>>>  resource.setMemory(memory)
>>>  containerCtx.setResource(resource)
>>>  containerCtx.setCommands(cmds.toList)
>>>  containerCtx.setLocalResources(Collections.singletonMap("package", packageResource))
>>>  appCtx.setApplicationId(appId)
>>>  appCtx.setUser(user.getShortUserName)
>>>  appCtx.setAMContainerSpec(containerCtx)
>>>  request.setApplicationSubmissionContext(appCtx)
>>>  applicationsManager.submitApplication(request)
>>> 
>>> When I poke around in /tmp/nm-local-dir/usercache/criccomi, I see nothing
>>> in there after a run. Before, when I was running with visibility set to USER
>>> (instead of APPLICATION), I could see the unzipped data in the filecache,
>>> but that's gone now that I switched to APPLICATION.
>>> 
>>> I'm probably just not putting the proper classpath. How do I do this?
>>> 
>>> Cheers,
>>> Chris
>>> 
> 
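
A note on the fix reported in this thread: the working command wraps the classpath entry in single quotes ('./package/*'). One likely reason this matters is the question of who expands the "*". Left unquoted, the shell globs ./package/* into separate words before java starts. Quoted, the literal pattern reaches the JVM, which (since Java 6) expands a classpath entry ending in "*" to the jar files in that directory. The sketch below shows the difference without needing a JVM. The show_args helper and the jar names are made up for illustration only.

```shell
#!/bin/sh
# Illustration only: show_args and the jar names below are hypothetical.
# show_args prints how many arguments it received, then each one on its own line.
show_args() {
  echo "argc=$#"
  for a in "$@"; do echo "  $a"; done
}

# Build a scratch "package" directory holding a few empty stand-in jars.
workdir=$(mktemp -d)
mkdir "$workdir/package"
touch "$workdir/package/a.jar" "$workdir/package/b.jar" "$workdir/package/c.jar"

# Unquoted: the shell expands the glob, so the command sees three separate words.
unquoted=$(cd "$workdir" && show_args ./package/*)

# Quoted: the pattern is passed through untouched as a single word,
# which is the form the JVM's own classpath wildcard handling expects.
quoted=$(cd "$workdir" && show_args './package/*')

echo "$unquoted"
echo "$quoted"
rm -rf "$workdir"
```

With an unquoted glob after -cp, only the first expanded jar would be taken as the classpath and the remaining paths would be parsed as further arguments, which is consistent with the class-not-found symptom described in the thread, though the thread itself does not confirm the exact cause.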


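
On Vinod's debugging suggestion (list the files recursively under the working directory): the plain "find ." output in the thread stops at ./package because find does not follow symlinks by default, and the localized archive is exposed as a symlink. The sketch below reproduces that with a scratch directory and shows that "find -L ." does descend into the link target. The directory layout and file names are hypothetical stand-ins, not the real NodeManager layout.

```shell
#!/bin/sh
# Illustration only: the directory layout and file names are hypothetical
# stand-ins for a container working directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/filecache/unpacked" "$workdir/container"
touch "$workdir/filecache/unpacked/a.jar" "$workdir/filecache/unpacked/b.jar"

# Mimic the localized archive: "package" is a symlink to the unpacked directory.
ln -s "$workdir/filecache/unpacked" "$workdir/container/package"

# Default find reports the symlink itself and stops there.
plain=$(cd "$workdir/container" && find . | sort)

# -L follows symlinks, so the unpacked contents show up too.
followed=$(cd "$workdir/container" && find -L . | sort)

echo "$plain"
echo "$followed"
rm -rf "$workdir"
```

Temporarily swapping something like "find -L ." in for the java portion of the launch command, while keeping the same stdout and stderr redirection, should write that listing to the container's log directory.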