tez-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: Problems with earlier Hadoop versions
Date Tue, 17 Feb 2015 22:26:07 GMT
So your use case involves getting user jars on the fly via cmd line and using them in the job?
Then 2) would be the solution. For all your standard jars etc. you could use 1) and configure
them for everyone in a standard manner.
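
As a rough sketch of the client-side half of option 2: a client could turn a comma-separated
jar list from the command line into a map from file name to location, which is the shape of
input that DAG.addTaskLocalFiles() works with once each location has been wrapped in a YARN
LocalResource. The class name UserJarArgs, the method parseJarList, and the comma-separated
argument format are assumptions made for illustration; they are not part of Tez or Hadoop,
and the LocalResource conversion is left out because it needs a live FileSystem.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tez: collects user jars named on the command line.
class UserJarArgs {

    /**
     * Maps each jar in a comma-separated list to its location, keyed by file name.
     * The key is the name the jar would be localized under; the value is where it
     * currently lives (local path or URI string).
     */
    static Map<String, String> parseJarList(String commaSeparatedJars) {
        Map<String, String> jars = new LinkedHashMap<>();
        for (String entry : commaSeparatedJars.split(",")) {
            String location = entry.trim();
            if (location.isEmpty()) {
                continue; // tolerate stray commas
            }
            String name = location.substring(location.lastIndexOf('/') + 1);
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Not a file: " + location);
            }
            // Two jars with the same file name would collide once localized.
            if (jars.putIfAbsent(name, location) != null) {
                throw new IllegalArgumentException("Duplicate jar name: " + name);
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // Example input only; a real client would take this from its own options.
        String jarList = args.length > 0 ? args[0] : "/tmp/udfs.jar,hdfs:///apps/lib/extra.jar";
        parseJarList(jarList).forEach(
            (name, location) -> System.out.println(name + " -> " + location));
    }
}
```

Each entry would still need to be turned into a LocalResource (size, timestamp, visibility)
before being handed to the DAG.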

Tez does not have a client side tool since the actual engines would have their own interfaces.
The TezClient API allows engines to interface their existing client code with Tez. The client
side code needs to construct the DAG, and that is essentially why it is application dependent.
Hence there is no generic client.

There is some work in progress to write a client side artifact that can take a DAG represented
as text and submit it. This would be kind of a generic client side tool that could do things
like add jars etc. based on cmd line, but it still needs a text format DAG to be specified.
It's unclear how to handle the binary payloads in this text format. Tracked in TEZ-1100.

Bikas

-----Original Message-----
From: Kostas Tzoumas [mailto:ktzoumas@apache.org] 
Sent: Tuesday, February 17, 2015 12:52 PM
To: dev@tez.apache.org
Subject: Re: Problems with earlier Hadoop versions

Hi Bikas,

I guess option 2 above would be the way to go for passing the user code jars for an individual
job.

I need to solve this problem for Flink. Basically, I would either need a dedicated Flink client
for the Tez backend (which would have to redo a lot of the work that the Hadoop CLI does),
or make it possible to ship around jars that are provided as command-line arguments to the
hadoop jar command.
I would like to start with the second option, and I would be happy to contribute back if the
solution is generic. Is this currently being implemented? If not, can you point me to the
JIRA and any prior work or design on this?

I will test again with a fresh Hadoop 2.4.0 cluster on Google cloud and report back if I can
reproduce the version incompatibility bug there.
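
One way to check for a compile-time versus runtime mismatch of this kind is to print which
jar a class is actually loaded from. The sketch below uses only the JDK; the class name
ClassOrigin is hypothetical, and it assumes it is run with the same classpath as the failing
process, passing for example org.apache.hadoop.yarn.api.records.ContainerId as the argument.

```java
import java.security.CodeSource;

// Hypothetical diagnostic, not part of Tez or Hadoop.
class ClassOrigin {

    /**
     * Returns the location (usually a jar URL) the named class was loaded from,
     * or a fixed marker string when the JVM records no code source for it.
     */
    static String locate(String className) throws ClassNotFoundException {
        // initialize=false: only load the class, do not run its static initializers.
        Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "bootstrap or unknown";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java ClassOrigin <fully.qualified.ClassName>");
            return;
        }
        System.out.println(args[0] + " <- " + locate(args[0]));
    }
}
```

If the reported jar is not the Hadoop version the code was compiled against, that would be
consistent with the incompatibility suspected here.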

Sorry for discussing two topics in this thread, I will start new threads when I have more
to report.


On Mon, Feb 16, 2015 at 8:32 PM, Bikas Saha <bikas@hortonworks.com> wrote:

> About the classpath, tez-examples work because they are part of the 
> tez jars which are available in tez.lib.uris. There is an open jira to 
> add code to the examples to show how to add other jars.
>
> There are different ways to add jars. You can add jars for all 
> vertices via
> 1) In the tez configuration file, "tez.aux.uris" can be used to 
> specify location of user jars like tez.lib.uris is used for tez jars. 
> These will be localized to the AM and all containers for all jobs 
> using that tez configuration file.
> 2) If jars need to be different per DAG then you can use 
> DAG.addTaskLocalFiles(). These files will go to all vertices of that DAG.
> 3) If jars need to be different per Vertex then you can use
> Vertex.addTaskLocalFiles(). These will be localized to tasks of that vertex.
> Different jars for different vertices will disable container reuse 
> across those vertices.
>
>
> Bikas
>
> -----Original Message-----
> From: Kostas Tzoumas [mailto:ktzoumas@apache.org]
> Sent: Monday, February 16, 2015 10:59 AM
> To: dev@tez.apache.org
> Subject: Re: Problems with earlier Hadoop versions
>
> Hi,
>
> I tried again with the full tarball and I am getting the same error. 
> The code was indeed compiled on a different machine, this might be the issue.
>
> Another issue I am having is related to classpath issues. How is the 
> user code (for example in the Tez examples) loaded into the classpath 
> of the containers that the job runs? Is there some functionality in 
> TezClient that I can use to ship the user code classes to the 
> containers they will be executed in?
>
> Best,
> Kostas
>
> On Fri, Feb 13, 2015 at 7:09 AM, Hitesh Shah <hitesh@apache.org> wrote:
>
> > Hi Kostas,
> >
> > 2.4.0 is something which we are looking to support as I believe 
> > there are quite a few users using it currently. Based on an email 
> > survey I sent out sometime back, I think there were users using a 
> > wide spectrum from 2.2 onwards. Though, at some point down the line, 
> > we would like to stop supporting 2.2 depending on the level of 
> > complexity reached to support the various hadoop versions and their
> > divergent feature sets.
> >
> > IAC, with respect to what you are seeing, at times, there are 
> > changes which go in that break builds against older versions of 
> > hadoop unintentionally. Most of the developers usually tend to use
> > the latest version of hadoop, and the recent change to make
> > 2.6.0 the default has not helped on that front. The 2.6.0 change was
> > mainly made with respect to the Tez UI and the dependence on YARN 
> > timeline server. From a security point of view, neither 2.4 nor 2.5 
> > have a proper secure Timeline server implementation in place.
> >
> > In most cases, we have a couple of daily builds ( 
> > https://builds.apache.org/job/Tez-Build-Hadoop-2.4/ and 
> > https://builds.apache.org/job/Tez-Build-Hadoop-2.2/ ) which usually 
> > catch these issues though the turnaround time on these bugs is 
> > dependent on someone picking them up quickly. Filed TEZ-2095 for the 
> > build failure - introduced a couple of days back.
> >
> > However, the 0.6.0 release is something that I don’t believe most of 
> > us were aware of. I just tried a local deploy of 0.6.0 and hadoop
> > 2.4.0 and did not hit any issues with the simple orderedwordcount 
> > job that I ran though I tried with the full tarball and not the 
> > minimal one. Would you mind trying with the full tarball and let me 
> > know whether the error is reproducible? The only thing I can think 
> > of here is that there is an incompatibility between what was 
> > compiled against and what is on the classpath on the cluster.
> >
> > thanks
> > — Hitesh
> >
> >
> > On Feb 12, 2015, at 3:32 AM, Kostas Tzoumas <ktzoumas@apache.org> wrote:
> >
> > > Hi folks,
> > >
> > > I would like to report the following:
> > >
> > > (1) the current master does not compile with hadoop 2.4.0
> > >
> > > > (mvn -DskipTests clean package -Dhadoop.version=2.4.0 -Phadoop24 -P\!hadoop26)
> > >
> > > > [ERROR] Failed to execute goal
> > > > org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile
> > > > (default-testCompile) on project tez-dag: Compilation failure
> > > > [ERROR] /.../tez/tez-dag/src/test/java/org/apache/tez/dag/app/TestTaskAttemptListenerImplTezDag.java:[161,42]
> > > > cannot find symbol
> > > > [ERROR] symbol:   method newContainerId(org.apache.hadoop.yarn.api.records.ApplicationAttemptId,long)
> > > > [ERROR] location: class org.apache.hadoop.yarn.api.records.ContainerId
> > >
> > > > (2) I am trying to run the examples of tez 0.6.0 on a cluster with
> > > > hadoop 2.4.0 and I am getting runtime exceptions:
> > >
> > > 2015-02-12 11:59:01,353 FATAL [main] app.DAGAppMaster: Error 
> > > starting DAGAppMaster
> > > java.lang.AbstractMethodError:
> > > org.apache.hadoop.yarn.api.records.ContainerId.setContainerId(J)V
> > >        at
> > >
> > org.apache.hadoop.yarn.api.records.ContainerId.newInstance(Container
> > Id
> > .java:60)
> > >        at
> > >
> > org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUt
> > il
> > s.java:178)
> > >        at
> > org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:1821)
> > >
> > > The error appears when both using the cluster's hadoop jars (with 
> > > the tez-minimal jar) and using the hadoop jars shipped with tez-0.6.0.
> > >
> > > > I compiled the 0.6.0 release with the -Dhadoop.version=2.4.0 -Phadoop24
> > > > -P\!hadoop26 options. I also tried editing the pom file and
> > > > changed the hadoop version and removed tez-plugins.
> > >
> > > Any advice? Is hadoop 2.4.0 supported in the long term, or would 
> > > you recommend to upgrade to 2.6.0?
> > >
> > > Best,
> > > Kostas
> >
> >
>