spark-user mailing list archives

From Vadim Semenov <vadim.seme...@datadoghq.com>
Subject Re: Bizarre UI Behavior after migration
Date Sun, 10 Sep 2017 23:16:19 GMT
I was going through emails I had sent and wanted to follow up on this one in
case someone else runs into the same question.

We found that the reason we saw stages marked complete without all of their
tasks complete is related to issues in the ListenerBus: when its event queue
fills up, events get dropped and the UI ends up out of sync.

We had to tune the event queue size; see
https://issues.apache.org/jira/browse/SPARK-15703

We also had to disable `eventLog` completely in some cases because of
https://issues.apache.org/jira/browse/SPARK-21460
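
For anyone who wants to try the two settings mentioned above, here is a rough
sketch of how they could be passed to spark-submit. It only builds the `--conf`
flags and does not import or run Spark. The helper name `listener_bus_conf` is
made up for illustration, and the queue size of 100000 is an arbitrary example
value, not the one used in this thread. The property name is, to my knowledge,
the one SPARK-15703 introduced for the Spark 2.1 line; later releases may use a
different name, so check the configuration docs for your version.

```python
# Sketch only: builds spark-submit --conf flags; it does not start Spark.
# listener_bus_conf is a hypothetical helper, not part of any Spark API.

def listener_bus_conf(queue_size=None, disable_event_log=False):
    """Return spark-submit arguments for the two workarounds in this thread."""
    args = []
    if queue_size is not None:
        # Assumed property name for Spark 2.1.x (added by SPARK-15703).
        args += ["--conf",
                 f"spark.scheduler.listenerbus.eventqueue.size={queue_size}"]
    if disable_event_log:
        # Turns off event logging for this application.
        args += ["--conf", "spark.eventLog.enabled=false"]
    return args


# Example: larger queue (arbitrary value) and event log disabled.
print(" ".join(listener_bus_conf(queue_size=100000, disable_event_log=True)))
```

Keep in mind that with the event log disabled, the history server will have
nothing to show for that application after it finishes.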

Facebook made some improvements in this area, which are discussed here and in
the related PRs: https://issues.apache.org/jira/browse/SPARK-18838

You can also watch them discuss this at Spark Summit SF 2017:
https://www.youtube.com/watch?v=5dga0UT4RI8

On Mon, May 22, 2017 at 8:35 PM, Miles Crawford <milesc@allenai.org> wrote:

> Well, what's happening here is that jobs become "un-finished" - they
> complete, and then later on pop back into the "Active" section showing a
> small number of complete/inprogress tasks.
>
> In my screenshot, Job #1 completed as normal, and then later on switched
> back to active with only 92 tasks... it never seems to change again, it's
> stuck in this frozen, active state.
>
>
> On Mon, May 22, 2017 at 12:50 PM, Vadim Semenov <
> vadim.semenov@datadoghq.com> wrote:
>
>> I believe it shows only the tasks that have actually been executed; if
>> there were tasks with no data, they don't get reported.
>>
>> I might be mistaken. If somebody has a good explanation, I would also like
>> to hear it.
>>
>> On Fri, May 19, 2017 at 5:45 PM, Miles Crawford <milesc@allenai.org>
>> wrote:
>>
>>> Hey y'all,
>>>
>>> Trying to migrate from Spark 1.6.1 to 2.1.0.
>>>
>>> I use EMR, and launched a new cluster using EMR 5.5, which runs Spark
>>> 2.1.0.
>>>
>>> I updated my dependencies, and fixed a few API changes related to
>>> accumulators, and presto! my application was running on the new cluster.
>>>
>>> But the application UI shows crazy output:
>>> https://www.dropbox.com/s/egtj1056qeudswj/sparkwut.png?dl=0
>>>
>>> The applications seem to complete successfully, but I was wondering if
>>> anyone has an idea of what might be going wrong?
>>>
>>> Thanks,
>>> -Miles
>>>
>>
>>
>
