tez-dev mailing list archives

From Daniel Dai <da...@hortonworks.com>
Subject Re: meet a problem when using "pig -x tez_local"
Date Fri, 06 Jun 2014 22:22:16 GMT
If there is no OrderBy and SkewedJoin, you can comment out everything
inside collectSample and give it a try. The cached sample is only used
by those operators.

Thanks,
Daniel
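
The workaround above sidesteps the lookup that throws the NPE. An alternative is a cache wrapper that tolerates a missing registry, sketched below as a minimal illustration only. SimpleRegistry, InMemoryRegistry and NullSafeCache are hypothetical names made up for this sketch; they are not the Tez ObjectRegistry API or Pig's ObjectCache, and this is not the fix that was applied in Pig or Tez.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a shared object registry; NOT the Tez API.
interface SimpleRegistry {
    Object get(String key);
    void put(String key, Object value);
}

// Map-backed registry, present only to make the sketch runnable.
final class InMemoryRegistry implements SimpleRegistry {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    @Override
    public Object get(String key) {
        return store.get(key);
    }

    @Override
    public void put(String key, Object value) {
        store.put(key, value);
    }
}

// Cache wrapper that tolerates a registry that was never injected (null).
final class NullSafeCache {
    private final SimpleRegistry registry; // may be null, e.g. in a local mode

    NullSafeCache(SimpleRegistry registry) {
        this.registry = registry;
    }

    // Returns an empty Optional instead of throwing when the registry is
    // missing or when the key has not been cached.
    Optional<Object> retrieve(String key) {
        if (registry == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(registry.get(key));
    }

    // Returns false when nothing could be cached because the registry is missing.
    boolean cache(String key, Object value) {
        if (registry == null) {
            return false;
        }
        registry.put(key, value);
        return true;
    }
}
```

With a wrapper like this, a caller that needs the cached sample can fall back to recomputing it when retrieve returns an empty Optional, rather than failing the task attempt.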

On Fri, Jun 6, 2014 at 1:10 PM, Chen He <airbots@gmail.com> wrote:
> Hi Daniel
>
> The same error occurs even after I comment out the "ORDER BY" line. I read
> the whole log. There are 7 vertices created by the script1-local.pig file:
> scope-72, scope-75, scope-84, scope-85, scope-93, scope-104, and scope-106.
> scope-72 is the root and it reads input from disk. This vertex finished
> successfully and I saw its output on disk.
> After scope-72, execution jumps directly to scope-104, which needs input
> from scope-85 and scope-93. scope-104 uses broadcastShuffle to get data
> from scope-85 and scope-93. scope-104 connects to scope-106, which is the
> final output vertex.
>
> The NPE occurs when scope-104 starts its first task attempt.
> I am not sure why there is no log record for scope-75 and scope-84.
>
> Chen
>
>
> On Fri, Jun 6, 2014 at 1:40 PM, Chen He <airbots@gmail.com> wrote:
>
>> Hi Daniel
>>
>> Thank you for the update. I will comment out that line when I run it.
>>
>> Regards!
>>
>> Chen
>>
>>
>> On Fri, Jun 6, 2014 at 1:19 PM, Daniel Dai <daijy@hortonworks.com> wrote:
>>
>>> There are several fixes for Order by in PIG-3846. I will try to commit
>>> it soon and hope it solves the problem. Can you focus on
>>> non-orderby/skewedjoin queries for now?
>>>
>>> Thanks,
>>> Daniel
>>>
>>> On Fri, Jun 6, 2014 at 7:12 AM, Chen He <airbots@gmail.com> wrote:
>>> > Hi Daniel
>>> >
>>> > I saw there is a line using "ordered by" in the script1-local.pig file.
>>> >
>>> > "ordered_uniq_frequency = ORDER filtered_uniq_frequency BY hour, score;"
>>> >
>>> > Regards!
>>> >
>>> > Chen
>>> >
>>> >
>>> > On Fri, Jun 6, 2014 at 12:17 AM, Chen He <airbots@gmail.com> wrote:
>>> >
>>> >> Thank you for the reply, Daniel. I will check tomorrow. Actually, I just
>>> >> use tutorial script1-local.
>>> >>
>>> >> Regards!
>>> >>
>>> >> Chen
>>> >> On Jun 5, 2014 11:42 PM, "Daniel Dai" <daijy@hortonworks.com> wrote:
>>> >>
>>> >>> Hi, Chen,
>>> >>> Are you seeing this issue only when running a query containing "order
>>> >>> by"? It seems to be a Pig bug on this side; I can take a look tomorrow.
>>> >>> But in the worst case, we can disable ObjectRegistry in local mode.
>>> >>>
>>> >>> Thanks,
>>> >>> Daniel
>>> >>>
>>> >>> On Thu, Jun 5, 2014 at 9:14 PM, Chen He <airbots@gmail.com> wrote:
>>> >>> > Hi Hitesh
>>> >>> >
>>> >>> > ObjectRegistry is used for broadcastShuffle. If a vertex has more than
>>> >>> > one task, Pig uses ObjectRegistry to save objects that can be used by
>>> >>> > all tasks in the vertex. ObjectRegistryImpl is annotated as a singleton.
>>> >>> > I suspect a race condition between the constructor and the caller.
>>> >>> >
>>> >>> > Regards!
>>> >>> >
>>> >>> > Chen
>>> >>> >
>>> >>> >
>>> >>> > On Thu, Jun 5, 2014 at 8:49 PM, Hitesh Shah <hitesh@apache.org> wrote:
>>> >>> >
>>> >>> >> @Chen, I am not sure asking for the user code to change is an option. We
>>> >>> >> should address the underlying issue, which is fixing ObjectRegistry to
>>> >>> >> work in local mode.
>>> >>> >>
>>> >>> >> thanks
>>> >>> >> -- Hitesh
>>> >>> >>
>>> >>> >>
>>> >>> >> On Thu, Jun 5, 2014 at 1:34 PM, Chen He <airbots@gmail.com> wrote:
>>> >>> >>
>>> >>> >> > Hi Hitesh
>>> >>> >> >
>>> >>> >> > In my code, I did not touch any ObjectRegistry-related code. I suspect
>>> >>> >> > the instantiation of ObjectRegistry has a problem, so that it failed to
>>> >>> >> > create an instance by the time it was called.
>>> >>> >> >
>>> >>> >> > I don't know why PigProcessor collects the sample vertex, but it looks
>>> >>> >> > like it is not a must for TezLocalMode.
>>> >>> >> >
>>> >>> >> > Chen
>>> >>> >> >
>>> >>> >> >
>>> >>> >> > On Thu, Jun 5, 2014 at 2:59 PM, Chen He <airbots@gmail.com> wrote:
>>> >>> >> >
>>> >>> >> > > Hi Hitesh
>>> >>> >> > >
>>> >>> >> > > Yes, objectRegistry returns NPE. I saw it is an injected static variable.
>>> >>> >> > >
>>> >>> >> > > Regards!
>>> >>> >> > >
>>> >>> >> > > Chen
>>> >>> >> > >
>>> >>> >> > >
>>> >>> >> > > On Thu, Jun 5, 2014 at 9:15 AM, Hitesh Shah <hitesh@apache.org> wrote:
>>> >>> >> > >
>>> >>> >> > >> @Chen, this might be a bug in the tez local execution code. Can you
>>> >>> >> > >> confirm that ObjectRegistryFactory::getObjectRegistry returns a
>>> >>> >> > >> non-null value?
>>> >>> >> > >>
>>> >>> >> > >> thanks
>>> >>> >> > >> — Hitesh
>>> >>> >> > >>
>>> >>> >> > >>
>>> >>> >> > >> On Wed, Jun 4, 2014 at 4:21 PM, Chen He <airbots@gmail.com> wrote:
>>> >>> >> > >>
>>> >>> >> > >> > I am using pig -x tez_local to test Tez local mode and met the
>>> >>> >> > >> > following problem (using tutorial.jar):
>>> >>> >> > >> >
>>> >>> >> > >> > 2014-06-03 11:06:35,964 [pool-1-thread-1] INFO org.apache.hadoop.mapred.YarnTezDagChild - Running task, taskAttemptId=attempt_1401811588093_0000_1_05_000000_0
>>> >>> >> > >> > 2014-06-03 11:06:35,965 [pool-1-thread-1] ERROR org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor - Encountered exception while processing:
>>> >>> >> > >> > java.lang.NullPointerException
>>> >>> >> > >> >         at org.apache.pig.backend.hadoop.executionengine.tez.ObjectCache.retrieve(ObjectCache.java:47)
>>> >>> >> > >> >         at org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor.collectSample(PigProcessor.java:307)
>>> >>> >> > >> >         at org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor.initializeInputs(PigProcessor.java:218)
>>> >>> >> > >> >         at org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor.run(PigProcessor.java:162)
>>> >>> >> > >> >         at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:317)
>>> >>> >> > >> >         at org.apache.hadoop.mapred.YarnTezDagChild$3.run(YarnTezDagChild.java:602)
>>> >>> >> > >> >         at java.security.AccessController.doPrivileged(Native Method)
>>> >>> >> > >> >         at javax.security.auth.Subject.doAs(Subject.java:415)
>>> >>> >> > >> >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>> >>> >> > >> >         at org.apache.hadoop.mapred.YarnTezDagChild.pollTask(YarnTezDagChild.java:591)
>>> >>> >> > >> >         at org.apache.hadoop.mapred.YarnTezDagChild.runTask(YarnTezDagChild.java:332)
>>> >>> >> > >> >         at org.apache.tez.dag.app.launcher.LocalContainerLauncher$1.run(LocalContainerLauncher.java:250)
>>> >>> >> > >> >         at java.lang.Thread.run(Thread.java:744)
>>> >>> >> > >> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> >>> >> > >> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> >>> >> > >> >         at java.lang.Thread.run(Thread.java:744)
>>> >>> >> > >> >
>>> >>> >> > >> > Checking the code implies that the ObjectCache is trying to get
>>> >>> >> > >> > something from a Map but it is not there.
>>> >>> >> > >> > But what is the purpose of using ObjectCache in Pig on Tez?
>>> >>> >> > >> > Why does Pig need sampleVertex in the PigProcessor?
>>> >>> >> > >> >
>>> >>> >> > >> > Any reply will be appreciated.
>>> >>> >> > >> >
>>> >>> >> > >> > Regards!
>>> >>> >> > >> >
>>> >>> >> > >> > Chen
>>> >>> >> > >> >
>>> >>> >> > >>
>>> >>> >> > >
>>> >>> >> > >
>>> >>> >> >
>>> >>> >>
>>> >>>
>>> >>> --
>>> >>> CONFIDENTIALITY NOTICE
>>> >>> NOTICE: This message is intended for the use of the individual or entity
>>> >>> to which it is addressed and may contain information that is confidential,
>>> >>> privileged and exempt from disclosure under applicable law. If the reader
>>> >>> of this message is not the intended recipient, you are hereby notified
>>> >>> that any printing, copying, dissemination, distribution, disclosure or
>>> >>> forwarding of this communication is strictly prohibited. If you have
>>> >>> received this communication in error, please contact the sender
>>> >>> immediately and delete it from your system. Thank You.
>>> >>>
>>> >>
>>>
>>>
>>
>>

