tez-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Solal Pirelli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (TEZ-3841) Proposal: Simulator mode
Date Thu, 21 Sep 2017 17:53:00 GMT
Solal Pirelli created TEZ-3841:

             Summary: Proposal: Simulator mode
                 Key: TEZ-3841
                 URL: https://issues.apache.org/jira/browse/TEZ-3841
             Project: Apache Tez
          Issue Type: New Feature
            Reporter: Solal Pirelli

Early work on a new feature proposal: a "simulator" mode in which vertices are not actually
executed, but instead use a simplified "fake" processor (which is configurable, and by default
does nothing) to let a developer see how certain workloads will be handled.

For instance, one might want to check what happens if a vertex has each of its 1000 tasks
send a bunch of events - does this scale? Or, what if a specific vertex fails 2% of the time
- how does this impact overall graph execution? Are 2 nodes with 10 containers per node enough,
or should one invest in a third node?

My current implementation is pretty simple: mimic the "uber" stuff to add a new "fake" mode
with a custom task scheduler and container launcher. It adds the following configuration values:
* Boolean to enable fake mode
* Number of nodes in fake mode
* Number of containers per mode in fake mode
* Class to run in fake mode - must inherit a new class `FakeProcessor`, with a single method
`run` that takes the vertex name, task index and task attempt, and returns a list of events.
Throwing an exception causes the task to fail.

I'm currently working on adding a "chaos monkey" kind of service which randomly kills tasks,
pre-empts containers, etc., but would appreciate some feedback on what's already done first.

P.S.: I have zero experience with using JIRA or contributing to Apache projects; if there
is a more formal procedure for suggesting a new feature, please point me to it.

This message was sent by Atlassian JIRA

View raw message