tez-dev mailing list archives

From Jonathan Eagles <jeag...@gmail.com>
Subject Re: Ordering requirements in DataMovementEvent and InputFailedEvent and implication on setting inputIsReady
Date Wed, 14 Mar 2018 18:46:12 GMT

I have spent some time looking over the code snippet posted and will try my
best to address your questions. But I do have a foundational question that
will help guide our continued conversations.

- What role and feature benefits do you see Tez taking on and fulfilling
for Scope? Pig and Hive have taken the approach that Tez will be used both
to construct the DAG and to provide the runtime execution in which to
run the DAG, in a YARN environment running on a Hadoop FileSystem API
compatible file system (like HDFS).

In your reference example, I see you are overriding AbstractLogicalInput.
This is an approach that would be used to make ScopeInput a native type to
Tez, as opposed to using input and output plugins to read Scope input
without changing the Tez code base. I wonder if this is intentional or not. If
not, I apologize for the lack of documentation that might have led you
astray. By attempting to make ScopeInput a native type to Tez, it takes a
direction different from Pig or Hive (or Flink or Cascading, for that
matter). From an integration perspective, this places a very large amount
of work on the developers working on Scope integration and may also
make contributions troublesome.

Once we understand the environment and role Tez is to play in Scope, we
(the community) would be happy to help guide you towards Scope integration.

For reference, here are the links to Pig and Hive. From there have a look
The code snippets below (as well as Tez) are under the Apache License v2, so
please only read them if that is possible for you.
Hive also works in a mixed YARN/LLAP runtime environment. That might or
might not be the best example to look at unless it matches the Scope case
as well. I am posting only the non-LLAP code path as an example.

