spark-user mailing list archives

From andy petrella <>
Subject Re: Getting spark job progress programmatically
Date Tue, 18 Nov 2014 15:04:59 GMT
Yep, we should also propose adding this to the public API.

Any other ideas?

On Tue Nov 18 2014 at 4:03:35 PM Aniket Bhatnagar <> wrote:

> Thanks Andy. This is very useful. This gives me all active stages & their
> percentage completion but I am unable to tie stages to job group (or
> specific job). I looked at Spark's code and to me, it
> seems org.apache.spark.scheduler.ActiveJob's group ID should get propagated
> to StageInfo (possibly in the StageInfo.fromStage method). For now, I will
> have to write my own version of JobProgressListener that stores a stageId to
> group ID mapping.
> I will submit a JIRA ticket and seek the Spark devs' opinion on this. Many
> thanks for your prompt help Andy.
> Thanks,
> Aniket
> On Tue Nov 18 2014 at 19:40:06 andy petrella <>
> wrote:
>> I started a quick hack for that in the notebook; you can head to:
>> blob/master/common/src/main/scala/notebook/front/widgets/SparkInfo.scala
>> On Tue Nov 18 2014 at 2:44:48 PM Aniket Bhatnagar <> wrote:
>>> I am writing yet another Spark job server and have been able to submit
>>> jobs and return/save results. I let multiple jobs use the same Spark
>>> context, but I set a job group while firing each job so that I can cancel
>>> jobs in the future. Further, what I want to do is provide some kind of status
>>> update/progress on running jobs (a % completion would be awesome), but I am
>>> unable to figure out the appropriate Spark API to use. I do however see status
>>> reporting in the Spark UI, so there must be a way to get the status of various
>>> stages per job group. Any hints on which APIs I should look at?
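
The stageId to group ID mapping discussed in this thread could be sketched roughly as below. This is only an illustration, not code from the thread or from Spark: GroupProgressTracker and its method names are hypothetical, and it deliberately has no Spark dependency. The assumption is that a custom SparkListener would forward its onJobStart, onStageSubmitted and onTaskEnd events to it.

```scala
import scala.collection.mutable

// Hypothetical helper (not part of Spark): keeps a stageId -> job group mapping
// and counts finished tasks, so progress can be reported per job group.
class GroupProgressTracker {
  private val stageToGroup = mutable.Map.empty[Int, String]
  private val totalTasks = mutable.Map.empty[Int, Int]
  private val doneTasks = mutable.Map.empty[Int, Int]

  // Assumed to be called on job start, with the job's group ID and stage IDs.
  def jobStarted(groupId: String, stageIds: Seq[Int]): Unit = synchronized {
    stageIds.foreach(stageId => stageToGroup(stageId) = groupId)
  }

  // Assumed to be called on stage submission, with the stage's task count.
  def stageSubmitted(stageId: Int, numTasks: Int): Unit = synchronized {
    totalTasks(stageId) = numTasks
  }

  // Assumed to be called once for every task that ends in the given stage.
  def taskEnded(stageId: Int): Unit = synchronized {
    doneTasks(stageId) = doneTasks.getOrElse(stageId, 0) + 1
  }

  // Percentage of finished tasks over the submitted stages of a group.
  // Returns None when no stage of that group has been submitted yet.
  def percentComplete(groupId: String): Option[Double] = synchronized {
    val stages = stageToGroup.collect {
      case (stageId, group) if group == groupId && totalTasks.contains(stageId) => stageId
    }.toSeq
    val total = stages.map(id => totalTasks(id)).sum
    if (total == 0) None
    else {
      // Cap per-stage counts so retried tasks cannot push a stage past 100%.
      val done = stages.map(id => math.min(doneTasks.getOrElse(id, 0), totalTasks(id))).sum
      Some(100.0 * done / total)
    }
  }
}
```

In a real listener, the group ID would presumably be read in onJobStart from the job's properties, where setJobGroup stores it under the key spark.jobGroup.id. Note that this sketch only counts stages that have already been submitted, so the percentage is optimistic while later stages of a job are still pending, and it counts failed tasks as finished.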
