spark-user mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: Getting spark job progress programmatically
Date Wed, 19 Nov 2014 16:12:16 GMT
This is already being covered by SPARK-2321 and SPARK-4145.  There are pull
requests that are already merged or already very far along -- e.g.,
https://github.com/apache/spark/pull/3009

If there is anything that needs to be added, please add it to those issues
or PRs.
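
For readers of the archive, here is a minimal sketch of the per-group "% completion" asked about further down the thread. It assumes a Spark release that ships the status tracker API (in PySpark, `sc.statusTracker()` with `getJobIdsForGroup`, `getJobInfo` and `getStageInfo`); that this API is the outcome of the issues named above is an assumption on my part. `group_progress`, `FakeTracker` and the group name "etl" are hypothetical names used only for illustration, and the fake tracker stands in for a real one so the snippet runs without a cluster.

```python
from collections import namedtuple

# Hypothetical stand-ins that mirror the fields the sketch reads from
# PySpark's job and stage info records.
JobInfo = namedtuple("JobInfo", "jobId stageIds status")
StageInfo = namedtuple("StageInfo", "stageId numTasks numCompletedTasks")


def group_progress(tracker, job_group):
    """Percent of tasks completed across all stages of a job group.

    Returns None when no tasks are known for the group yet.
    """
    total = done = 0
    seen = set()  # a stage shared by several jobs is counted once
    for job_id in tracker.getJobIdsForGroup(job_group):
        job = tracker.getJobInfo(job_id)
        if job is None:  # info may be unavailable for a given job
            continue
        for stage_id in job.stageIds:
            if stage_id in seen:
                continue
            seen.add(stage_id)
            stage = tracker.getStageInfo(stage_id)
            if stage is None:
                continue
            total += stage.numTasks
            done += stage.numCompletedTasks
    return 100.0 * done / total if total else None


class FakeTracker:
    """Hypothetical in-memory tracker so the sketch runs without Spark."""

    def __init__(self, groups, jobs, stages):
        self._groups, self._jobs, self._stages = groups, jobs, stages

    def getJobIdsForGroup(self, job_group):
        return self._groups.get(job_group, [])

    def getJobInfo(self, job_id):
        return self._jobs.get(job_id)

    def getStageInfo(self, stage_id):
        return self._stages.get(stage_id)


tracker = FakeTracker(
    groups={"etl": [0, 1]},
    jobs={0: JobInfo(0, [0, 1], "RUNNING"), 1: JobInfo(1, [1, 2], "RUNNING")},
    stages={
        0: StageInfo(0, 4, 4),
        1: StageInfo(1, 10, 5),
        2: StageInfo(2, 6, 1),
    },
)
print(group_progress(tracker, "etl"))
```

Against a live context the call would look like `group_progress(sc.statusTracker(), "my-group")`, with the group set beforehand via `sc.setJobGroup`. Counting tasks rather than stages is a design choice: stages differ widely in size, so a task-weighted figure tends to track actual work more closely.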

On Wed, Nov 19, 2014 at 7:55 AM, Aniket Bhatnagar <
aniket.bhatnagar@gmail.com> wrote:

> For now I have filed a JIRA ticket at
> https://issues.apache.org/jira/browse/SPARK-4473. I will collate all my
> experiences (& hacks) and submit them as a feature request for a public API.
>
> On Tue Nov 18 2014 at 20:35:00 andy petrella <andy.petrella@gmail.com>
> wrote:
>
>> Yep, we should also propose adding this to the public API.
>>
>> Any other ideas?
>>
>> On Tue Nov 18 2014 at 4:03:35 PM Aniket Bhatnagar <
>> aniket.bhatnagar@gmail.com> wrote:
>>
>>> Thanks Andy. This is very useful. It gives me all active stages and
>>> their percentage completion, but I am unable to tie stages to a job group
>>> (or a specific job). I looked at Spark's code, and it seems to me that
>>> org.apache.spark.scheduler.ActiveJob's group ID should get propagated
>>> to StageInfo (possibly in the StageInfo.fromStage method). For now, I will
>>> have to write my own version of JobProgressListener that stores a stageId
>>> to group ID mapping.
>>>
>>> I will submit a JIRA ticket and seek the Spark devs' opinion on this. Many
>>> thanks for your prompt help, Andy.
>>>
>>> Thanks,
>>> Aniket
>>>
>>>
>>> On Tue Nov 18 2014 at 19:40:06 andy petrella <andy.petrella@gmail.com>
>>> wrote:
>>>
>>>> I started a quick hack for that in the notebook; you can head to:
>>>> https://github.com/andypetrella/spark-notebook/blob/master/common/src/main/scala/notebook/front/widgets/SparkInfo.scala
>>>>
>>>> On Tue Nov 18 2014 at 2:44:48 PM Aniket Bhatnagar <
>>>> aniket.bhatnagar@gmail.com> wrote:
>>>>
>>>>> I am writing yet another Spark job server and have been able to submit
>>>>> jobs and return/save results. I let multiple jobs use the same Spark
>>>>> context, but I set a job group when firing each job so that I can cancel
>>>>> jobs in the future. Further, what I would like to do is provide some kind
>>>>> of status update/progress on running jobs (a % completion would be
>>>>> awesome), but I am unable to figure out the appropriate Spark API to use.
>>>>> I do, however, see status reporting in the Spark UI, so there must be a
>>>>> way to get the status of the various stages per job group. Any hints on
>>>>> which APIs I should look at?
>>>>
>>>>
