spark-issues mailing list archives

From "Nihar Sheth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-24918) Executor Plugin API
Date Wed, 22 Aug 2018 18:48:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-24918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589248#comment-16589248 ]

Nihar Sheth commented on SPARK-24918:
-------------------------------------

[~irashid] has asked me to add testing to his PR. I'm not sure what the standard procedure is; can I just open another PR with his changes and the tests?

> Executor Plugin API
> -------------------
>
>                 Key: SPARK-24918
>                 URL: https://issues.apache.org/jira/browse/SPARK-24918
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Imran Rashid
>            Priority: Major
>              Labels: SPIP, memory-analysis
>
> It would be nice if we could specify an arbitrary class to run within each executor for debugging and instrumentation.  It's hard to do this currently because:
> a) you have no idea when executors will come and go with DynamicAllocation, so you don't have a chance to run custom code before the first task
> b) even with static allocation, you'd have to change the code of your Spark app itself to run a special task to "install" the plugin, which is often tough in production cases where those maintaining regularly running applications might not even know how to make changes to the application.
> For example, https://github.com/squito/spark-memory could be used in a debugging context to understand memory use, just by re-running an application with extra command-line arguments (as opposed to rebuilding Spark).
> I think one tricky part here is just deciding the API, and how it's versioned.  Does it just get created when the executor starts, and that's it?  Or does it get more specific events, like task start, task end, etc.?  Would we ever add more events?  It should definitely be a {{DeveloperApi}}, so breaking compatibility would be allowed ... but it should still be avoided.  We could create a base class that has no-op implementations, or explicitly version everything.
> Note that this is not needed in the driver as we already have SparkListeners (even if you don't care about the SparkListenerEvents and just want to inspect objects in the JVM, it's still good enough).
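
A rough sketch of how such a plugin hook might look, purely to make the design questions above concrete: a small {{DeveloperApi}} trait whose methods are all no-ops, so later Spark versions could add events without breaking existing plugins. The trait name, the event methods, and the spark.executor.plugins config key below are illustrative assumptions, not anything decided on this ticket.

    import org.apache.spark.annotation.DeveloperApi

    // Hypothetical plugin API: every method is a no-op by default, so adding
    // new events in a later release would not break already-compiled plugins.
    @DeveloperApi
    trait ExecutorPlugin {
      /** Called once when the executor starts, before it runs any task. */
      def init(): Unit = {}

      /** Optional finer-grained events (an open question on this ticket). */
      def onTaskStart(): Unit = {}
      def onTaskSucceeded(): Unit = {}
      def onTaskFailed(reason: Throwable): Unit = {}

      /** Called once when the executor shuts down. */
      def shutdown(): Unit = {}
    }

With something like that in place, a memory-monitoring plugin could be enabled at submit time without touching the application code, e.g. (again assuming a hypothetical config key and plugin class):

    spark-submit \
      --conf spark.executor.plugins=com.example.MemoryMonitorPlugin \
      --jars memory-monitor.jar \
      ...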
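
For comparison, the driver-side hook the last paragraph refers to already exists: a SparkListener can be registered purely through configuration (spark.extraListeners), and its callbacks run inside the driver JVM, so it can inspect driver state even if it ignores the events themselves. A minimal example (the class name is made up):

    import org.apache.spark.scheduler.{SparkListener, SparkListenerApplicationStart}

    // Runs in the driver JVM; enable with
    //   --conf spark.extraListeners=com.example.DriverInspector
    class DriverInspector extends SparkListener {
      override def onApplicationStart(event: SparkListenerApplicationStart): Unit = {
        // The event itself may not matter; the point is that this code executes
        // inside the driver, so it can inspect any JVM state from here.
        println(s"Driver inspection hook active for ${event.appName}")
      }
    }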




