tez-dev mailing list archives

From Jeff Zhang <zjf...@gmail.com>
Subject Re: Should we display the real exception in client side if job failed ?
Date Thu, 26 Jun 2014 08:47:34 GMT
Besides, IMHO, if a user forgets to specify the LocalResource for their
Processor, it should cause a TaskAttempt failure rather than a Container
failure.
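
To illustrate the idea, here is a minimal, self-contained sketch (plain Java,
not Tez code) of how a class-loading failure could be caught and turned into
a diagnostics string that carries the real exception. The names
DiagnosticsSketch, loadProcessor, toDiagnostics and
com.example.MissingProcessor are hypothetical and exist only for this
example; the actual Tez code paths are different.

```java
public class DiagnosticsSketch {

    // Walk the cause chain to the innermost exception (hypothetical helper).
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    // Build a diagnostics string that includes the real exception instead of
    // only a generic "container exited with non-zero exit code" message.
    static String toDiagnostics(String attemptId, Throwable t) {
        StringBuilder sb = new StringBuilder();
        sb.append("TaskAttempt ").append(attemptId).append(" failed, info=[");
        sb.append(t);
        Throwable root = rootCause(t);
        if (root != t) {
            sb.append(", caused by: ").append(root);
        }
        sb.append("]");
        return sb.toString();
    }

    // Stand-in for processor creation: load a class by name via reflection and
    // wrap any reflection failure in an unchecked exception.
    static Object loadProcessor(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Unable to load class: " + className, e);
        }
    }

    public static void main(String[] args) {
        try {
            // Assumed to be missing from the classpath, like a jar that was
            // never added to the local resources.
            loadProcessor("com.example.MissingProcessor");
        } catch (RuntimeException e) {
            // Report the failure as an attempt-level diagnostic rather than
            // letting the uncaught exception kill the whole process.
            System.out.println(toDiagnostics("0", e));
        }
    }
}
```

With something along these lines, the client would see the "Unable to load
class" message directly, without having to run "yarn logs".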


On Thu, Jun 26, 2014 at 4:43 PM, Jeff Zhang <zjffdu@gmail.com> wrote:

> I experimented with a code change that sends the real exception to the
> client side. Let me know whether you think this is worth fixing.
>
>
> On Thu, Jun 26, 2014 at 3:37 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
>
>> Hi all,
>>
>> I have a Tez job that failed because I didn't add my jar to the local
>> resources. On the client side, however, the exception does not make clear
>> what went wrong. The real cause is that the Processor class could not be
>> loaded, and I had to run "yarn logs" to find the real exception in the
>> container logs. I have another case where my Processor itself throws an
>> exception, and the message on the client side is still unclear. I think
>> we should pass the real exception into the diagnostics and display it on
>> the client side, which would help users find out what is wrong with their
>> program. Let me know your comments, thanks. (The client-side and container
>> logs follow.)
>>
>>
>> *Exception on client side*
>>
>> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: summer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 0 Killed: 1
>> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: tokenizer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 1 Killed: 0
>> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: DAG completed. FinalState=FAILED
>> DAG diagnostics:[Vertex failed, vertexName=tokenizer, vertexId=vertex_1403765612557_0004_1_00, diagnostics=[Task failed, taskId=task_1403765612557_0004_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_1403765612557_0004_01_000002 COMPLETED with diagnostics set to [Exception from container-launch:
>> org.apache.hadoop.util.Shell$ExitCodeException:
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
>>     at org.apache.hadoop.util.Shell.run(Shell.java:418)
>>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
>>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Container exited with a non-zero exit code 1
>> *The real exception (from the container log):*
>>
>> 2014-06-26 14:57:02,146 ERROR [main] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main,5,main] threw an Exception.
>> org.apache.tez.dag.api.TezUncheckedException: Unable to load class: com.zjffdu.tutorial.tez.WordCount$TokenProcessor
>>     at org.apache.tez.common.RuntimeUtils.getClazz(RuntimeUtils.java:44)
>>     at org.apache.tez.common.RuntimeUtils.createClazzInstance(RuntimeUtils.java:66)
>>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.createProcessor(LogicalIOProcessorRuntimeTask.java:533)
>>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.<init>(LogicalIOProcessorRuntimeTask.java:146)
>>     at org.apache.tez.runtime.task.TezTaskRunner.<init>(TezTaskRunner.java:78)
>>     at org.apache.tez.runtime.task.TezChild.run(TezChild.java:208)
>>     at org.apache.tez.runtime.task.TezChild.main(TezChild.java:363)
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>



-- 
Best Regards

Jeff Zhang
