tez-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: In-memory shuffle in Tez
Date Mon, 21 Apr 2014 16:35:53 GMT
That in-memory support was removed because the Tez team is exploring better
ways to achieve the same result. The issue with the removed approach was
that it would hold onto those memory resources on YARN containers for an
indeterminate amount of time and prevent optimal container release.

Tez, as of now, only supports the persisted edge property, which implies data
persisted locally by the task. Streaming requires considerable support for
dependency management, coupling of producer-consumer speeds, and fault
tolerance, and we haven't yet found a strong enough use case to make that
investment. An Apache Samza-like approach of heavy-weight
streaming, wherein the intermediate data is written to a store that can also
serve that data while it's being written, seems like a reasonable approach
that gets most of the benefits of streaming while avoiding a lot of the
fault tolerance and producer-consumer speed-matching issues of direct
streaming.
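
For illustration only, here is a minimal sketch of the store-backed idea
described above. It is not Tez or Samza code: AppendOnlyStore, StoreDemo, and
drain are hypothetical names, and an in-memory list stands in for a real
persistent store. The producer appends without waiting for consumers, and a
consumer reads by offset, so it can follow the data while it is still being
written or replay it from the start after a failure.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical append-only store: a producer appends, consumers read by offset. */
final class AppendOnlyStore {
    private final List<String> records = new ArrayList<>();
    private boolean closed = false;

    /** Producer side: appends a record and never waits for consumers. */
    synchronized void append(String record) {
        if (closed) throw new IllegalStateException("store is closed");
        records.add(record);
        notifyAll(); // wake consumers waiting for new data
    }

    /** Producer signals that no more records will arrive. */
    synchronized void close() {
        closed = true;
        notifyAll();
    }

    /**
     * Consumer side: blocks until the record at the given offset exists.
     * Returns null once the store is closed and the offset is past the end.
     */
    synchronized String read(int offset) throws InterruptedException {
        while (offset >= records.size() && !closed) {
            wait();
        }
        return offset < records.size() ? records.get(offset) : null;
    }
}

final class StoreDemo {
    /** Reads every record from fromOffset until the store is closed. */
    static List<String> drain(AppendOnlyStore store, int fromOffset)
            throws InterruptedException {
        List<String> out = new ArrayList<>();
        int offset = fromOffset;
        String record;
        while ((record = store.read(offset)) != null) {
            out.add(record);
            offset++;
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        AppendOnlyStore store = new AppendOnlyStore();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 5; i++) store.append("record-" + i);
            store.close();
        });
        producer.start();

        // The consumer reads while the producer may still be writing.
        List<String> first = drain(store, 0);
        producer.join();

        // A restarted consumer replays from offset 0 without re-running the producer.
        List<String> replay = drain(store, 0);
        System.out.println(first.equals(replay) + " " + first.size());
    }
}
```

Because the consumer tracks only an offset, a slow or restarted consumer never
stalls the producer. The cost is that the data goes through the store rather
than directly between tasks.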


-----Original Message-----
From: Manu Zhang [mailto:owenzhang1990@gmail.com]
Sent: Monday, April 21, 2014 1:55 AM
To: dev@tez.incubator.apache.org
Subject: In-memory shuffle in Tez


I used to develop on Tez-0.2.0 but haven't followed Tez-0.3.0 and Tez-0.4.0.
I remember Tez used to have InMemorySortedOutput and InMemoryShuffleSorter
which didn't spill map output to file.
I found they were removed in
TEZ-791<https://issues.apache.org/jira/browse/TEZ-791> so I wonder whether
Tez still supports some sort of in-memory shuffle?
A related question is whether there are any examples of an Ephemeral
EdgeProperty where a source task may stream its outputs directly to a
destination task.

Manu Zhang
