spark-dev mailing list archives

From Ryan Blue <rb...@netflix.com.INVALID>
Subject Re: [DISCUSS] Change default executor log URLs for YARN
Date Fri, 08 Feb 2019 20:41:28 GMT
Here's what I see for a running job on our cluster. Both entries are links
that lead to the same stderr and stdout pages that Spark links to today.

stderr : Total file length is 18557 bytes.
stdout : Total file length is 0 bytes.

While it is nice to see that stderr or stdout has content, I don't think
that this is worth the extra click or changes to Spark.

However, we have configured our logs to go to stderr and stdout, so these
links work for us. I think some YARN applications send logs to a separate
log file, which would be useful to list here. Does anyone have logs
going to locations other than stderr and stdout?

If there are logs going to other files, then I think making this an option
is reasonable. Otherwise, I think we should leave links as they are.

rb

On Thu, Feb 7, 2019 at 12:31 PM Jungtaek Lim <kabhwan@gmail.com> wrote:

> The new URL shows all local logs, including stdout and stderr, as a list.
>
> The change would help when end users modify their log4j configuration to
> write additional log files, and also with GC logs. Currently Spark shows only
> two static files (stdout, stderr) as individual links, so the content is easy
> to reach (one click), but users have to remove the file part from the URL
> manually to reach the list page. Instead, we could change the default URL to
> show all local logs and let users choose which file to read (though it
> would then take two clicks to reach the actual file).
>
> -Jungtaek Lim (HeartSaVioR)
>
> On Fri, Feb 8, 2019 at 1:33 AM, Ryan Blue <rblue@netflix.com> wrote:
>
>> Jungtaek,
>>
>> What is shown at the new URL and how would this improve usability?
>>
>> On Thu, Feb 7, 2019 at 12:45 AM Jungtaek Lim <kabhwan@gmail.com> wrote:
>>
>>> Hi devs,
>>>
>>> Based on the suggestion Tom Graves gave me in SPARK-26792
>>> <https://issues.apache.org/jira/browse/SPARK-26792>, I'd like to hear
>>> opinions on changing the default executor log URLs for YARN, specifically
>>> removing "stdout" and "stderr" and providing a link that lists the log
>>> files. For example, instead of referring to the two links below:
>>>
>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>/<stdout|stderr>?start=-4096
>>>
>>> we would refer to only the one link below:
>>>
>>> http://<NM_HOST>:<NM_PORT>/node/containerlogs/<CONTAINER_ID>/<USER>
>>>
>>> I've checked that the new URL works with the redirection from the NM to
>>> jobhistory, so it won't break what we currently support. Reaching the
>>> actual log file would take two clicks instead of one, though.
>>>
>>> Since this changes the UX, I'd like to hear opinions on it before
>>> submitting a patch. If we'd rather keep this as it is, I would instead just
>>> make it possible to apply a custom log URL to the Spark UI as well.
>>>
>>> Thanks in advance!
>>>
>>> FYI, below is the rationale for this discussion:
>>>
>>> While working on SPARK-23155
>>> <https://issues.apache.org/jira/browse/SPARK-23155>, I got some input
>>> about linking to the "log directory" instead of separate log URLs for
>>> "stdout" and "stderr", because in practice end users write more files than
>>> just stdout and stderr (such as GC logs).
>>>
>>> SPARK-23155 provides a way to modify the log URL, but it only applies to
>>> the SHS; for running apps, the Spark UI still shows only "stdout" and
>>> "stderr". SPARK-26792 is about applying this to the Spark UI as well, but I
>>> got a suggestion to just change the default log URL instead.
>>>
>>> Thanks again,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>

-- 
Ryan Blue
Software Engineer
Netflix
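
Editorial sketch: the manual step described in the thread (removing the file
part from a per-file log URL to reach the list page) could look roughly like
the Python below. This is only an illustration, not Spark code. The function
name, host, port, container ID, and user in the example are hypothetical
placeholders, and the URL layout is assumed to match the templates quoted in
the thread.

```python
from urllib.parse import urlsplit, urlunsplit


def log_directory_url(file_url: str) -> str:
    """Turn a per-file container log URL into the URL of the log list page.

    Hypothetical helper: assumes the last path segment is the file name
    (for example "stdout" or "stderr"), as in the templates quoted above.
    """
    parts = urlsplit(file_url)
    # Drop the last path segment (the file name) to reach the list page.
    directory_path = parts.path.rstrip("/").rsplit("/", 1)[0]
    # Discard the query string (e.g. "start=-4096"); it only applies to a file.
    return urlunsplit((parts.scheme, parts.netloc, directory_path, "", ""))


# Made-up values standing in for <NM_HOST>, <NM_PORT>, <CONTAINER_ID>, <USER>.
example = (
    "http://nm-host.example:8042/node/containerlogs/"
    "container_01/alice/stderr?start=-4096"
)
print(log_directory_url(example))
```

Whether the resulting page lists every local log file depends on the
NodeManager, as discussed in the thread.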
