tez-dev mailing list archives

From Hitesh Shah <hit...@apache.org>
Subject Re: ClassNotFoundException with custom InputFormat.
Date Thu, 18 Jun 2015 16:57:26 GMT
Hi Andre 

Are you using Local Resource type ARCHIVE? Using FILE may not help in your scenario.

If you are using ARCHIVE, you can then use the classpath config
(TEZ_CLUSTER_ADDITIONAL_CLASSPATH_PREFIX) to modify the classpath.
      
For example, assume foo.jar and bar.jar (in the structure that you called out) are added
to the map of local resources using keys foo and bar:
      - classpath prefix would be "$PWD/foo/*:$PWD/foo/lib/*:$PWD/bar/*:$PWD/bar/lib/*:"
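
A minimal sketch of how a prefix of that shape could be assembled from the local resource keys. It only builds the string and does not call any Tez API; the class name ClasspathPrefixSketch and the method classpathPrefix are hypothetical names used for illustration, and the sketch assumes each ARCHIVE is unpacked under $PWD/<key> with its libs in a lib sub-directory:

```java
import java.util.Arrays;
import java.util.List;

public class ClasspathPrefixSketch {

    // Hypothetical helper: for every local-resource key, add the unpacked
    // archive root and its lib sub-directory, each followed by ':'.
    static String classpathPrefix(List<String> resourceKeys) {
        StringBuilder prefix = new StringBuilder();
        for (String key : resourceKeys) {
            prefix.append("$PWD/").append(key).append("/*:");
            prefix.append("$PWD/").append(key).append("/lib/*:");
        }
        return prefix.toString();
    }

    public static void main(String[] args) {
        // Keys under which foo.jar and bar.jar were added as ARCHIVE resources.
        String prefix = classpathPrefix(Arrays.asList("foo", "bar"));
        System.out.println(prefix);
    }
}
```

The resulting string would then be supplied as the value of the classpath prefix config named above.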


As mentioned on the JIRA, the launch_container.sh from your cluster would help diagnose
this. Also, if you upload an example jar to the JIRA, I can help provide a working example.

thanks
— Hitesh


On Jun 18, 2015, at 9:40 AM, Andre Kelpe <akelpe@concurrentinc.com> wrote:

> On Wed, Jun 17, 2015 at 4:58 PM, Bikas Saha <bikas@hortonworks.com> wrote:
> 
>> If I understand this right, there is a jar with user code in it. The jar
>> needs to be available during split creation but it is not available.
>> 
>> 
>> 
>> Is split creation happening on the client or on the AM? If it's happening
>> on the AM, and the AM is not getting the jars, then how are you specifying
>> the jars to be sent to the AM? There are different ways to do it.
>> 
> 
> In our case the AM is doing the split calculation. We are sending the jar
> over as LocalResources given in the TezClient#create method.
> 
> 
>> 1)      Set tez.aux.uris in tez-site.xml to an HDFS location and copy
>> user jars there
>> 
>> 2)      Upload the user jar to HDFS and create a YARN local resource for
>> it. Then use either of the following to add the local resource to the
>> AM/DAG that needs it.
>> 
>> a.       TezClient#addAppMasterLocalFiles(…)
>> 
>> b.      DAG#addTaskLocalFiles(…)
>> 
>> 
>> 
>> Not sure what is meant by classic Hadoop style jars?
>> 
> 
> Hadoop-style jars are jar files that contain the user code plus all
> required libs in a lib sub-directory within the jar. This is the layout
> that RunJar has always understood.
> 
> The problem is that we can't find a way to put the jars from the lib
> folder of the job jar on the classpath of the AM.
> 
> - André
> 
> 
> 
>> 
>> 
>> Bikas
>> 
>> 
>> 
>> *From:* Chris K Wensel [mailto:chris@wensel.net]
>> *Sent:* Wednesday, June 17, 2015 4:41 PM
>> *To:* dev@tez.apache.org
>> *Cc:* user@tez.apache.org
>> *Subject:* Re: ClassNotFoundException with custom InputFormat.
>> 
>> 
>> 
>> cross posting down to dev… should continue the discussion there I believe.
>> 
>> 
>> 
>> as I understand it, all Cascading users who package a Hadoop job jar with
>> a lib folder containing the custom InputFormat (pulled from Maven etc.)
>> will have this issue.
>> 
>> 
>> 
>> this also expands to projects on top of Cascading including Scalding and
>> Cascalog.
>> 
>> 
>> 
>> oddly the org.apache.tez.client.AMConfiguration has a
>> 
>> 
>> 
>> private Map<String, String> env;
>> 
>> 
>> 
>> but it is unused.
>> 
>> 
>> 
>> On Jun 17, 2015, at 4:32 PM, Andre Kelpe <akelpe@concurrentinc.com>
>> wrote:
>> 
>> 
>> 
>> Hi,
>> 
>> we are currently running into a problem when a user of Cascading uses a
>> custom InputFormat with Tez. The ApplicationMaster is running into a
>> ClassNotFoundException when calculating the splits, since we are unable to
>> control the environment/classpath visible to the ApplicationMaster. We
>> have a work-around, where the users have to supply a fat-jar to make it
>> work, but we need to be able to support other ways as well.
>> 
>> When interacting with the DAG, we are able to pass along a custom
>> environment/classpath, but that API is missing on the TezClient, causing
>> the AppMaster to fail when the user is using classic Hadoop-style jars
>> (embedded lib directory).
>> 
>> In order to get Lingual, our SQL layer on top of Cascading, to work
>> correctly, we need a way to supply the environment that is more dynamic
>> than one fat jar, so it would be great if the API could be extended to do
>> that.
>> 
>> I have opened https://issues.apache.org/jira/browse/TEZ-2563
>> 
>> Thanks!
>> 
>> 
>> 
>> - André
>> 
>> 
>> --
>> 
>> André Kelpe
>> andre@concurrentinc.com
>> http://concurrentinc.com
>> 
>> 
>> 
>> —
>> 
>> Chris K Wensel
>> 
>> chris@wensel.net
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> André Kelpe
> andre@concurrentinc.com
> http://concurrentinc.com

