spark-dev mailing list archives

From Patrick Wendell <pwend...@gmail.com>
Subject Re: 0.9.0 forces log4j usage
Date Fri, 07 Feb 2014 18:11:15 GMT
This also seems relevant - but not my area of expertise (whether this
is a valid way to check this).

http://stackoverflow.com/questions/10505418/how-to-find-which-library-slf4j-has-bound-itself-to
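A minimal sketch of that check, along the lines of the Stack Overflow answer: ask slf4j for its active `ILoggerFactory` and look at its class name. Reflection is used here only so the sketch compiles and runs even when slf4j is not on the classpath; with a real binding present, the reflective calls are equivalent to `LoggerFactory.getILoggerFactory().getClass().getName()`.

```java
public class Slf4jBindingCheck {
    // Returns the class name of the active slf4j ILoggerFactory
    // implementation, or "none" if slf4j is absent from the classpath.
    static String activeBinding() {
        try {
            Class<?> lf = Class.forName("org.slf4j.LoggerFactory");
            Object factory = lf.getMethod("getILoggerFactory").invoke(null);
            return factory.getClass().getName();
        } catch (ReflectiveOperationException e) {
            return "none";
        }
    }

    public static void main(String[] args) {
        // With slf4j-log4j12 bound, this prints something like
        // "org.slf4j.impl.Log4jLoggerFactory"; with logback,
        // "ch.qos.logback.classic.LoggerContext".
        System.out.println(activeBinding());
    }
}
```

Spark could use a check like this to enforce its log4j default only when slf4j is actually bound to log4j, i.e. option (b) below.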

On Fri, Feb 7, 2014 at 10:08 AM, Patrick Wendell <pwendell@gmail.com> wrote:
> Hey Guys,
>
> Thanks for explaining. Ya, this is a problem - we didn't really know
> that people were using other slf4j backends. slf4j is in there for
> historical reasons, but I think we may assume in a few places that
> log4j is being used, and we should minimize those.
>
> We should patch this and get a fix into 0.9.1. So some solutions I see are:
>
> (a) Add SparkConf option to disable this. I'm fine with this one.
>
> (b) Ask slf4j which backend is active and only try to enforce this
> default if we know slf4j is using log4j. Do either of you know if this
> is possible? Not sure if slf4j exposes this.
>
> (c) Just remove this default stuff. We'd rather not do this. The goal
> of this thing is to provide good usability for people who have linked
> against Spark and haven't done anything to configure logging. For
> beginners we try to minimize the assumptions about what else they know
> about, and I've found log4j configuration is a huge mental barrier for
> people who are getting started.
>
> Paul if you submit a patch doing (a) we can merge it in. If you have
> any idea if (b) is possible I prefer that one, but it may not be
> possible or might be brittle.
>
> - Patrick
>
> On Fri, Feb 7, 2014 at 6:36 AM, Koert Kuipers <koert@tresata.com> wrote:
>> Totally agree with Paul: a library should not pick the slf4j backend. It
>> defeats the purpose of slf4j. That big ugly warning is there to alert
>> people that it's their responsibility to pick the backend...
>> On Feb 7, 2014 3:55 AM, "Paul Brown" <prb@mult.ifario.us> wrote:
>>
>>> Hi, Patrick --
>>>
>>> From slf4j, you can either backend it into log4j (which is the way that
>>> Spark is shipped) or you can route log4j through slf4j and then on to a
>>> different backend (e.g., logback).  We're doing the latter and manipulating
>>> the dependencies in the build because that's the way the enclosing
>>> application is set up.
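For reference, the routing Paul describes typically looks like this in a Maven build: exclude Spark's log4j binding, then bridge log4j calls into slf4j backed by logback. This is a sketch; the version numbers are illustrative for the 0.9.0 era, not prescribed by the thread.

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>0.9.0-incubating</version>
  <exclusions>
    <!-- Drop Spark's log4j binding so logback can take over. -->
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<!-- Re-route log4j API calls into slf4j... -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>log4j-over-slf4j</artifactId>
  <version>1.7.5</version>
</dependency>
<!-- ...and back slf4j with logback. -->
<dependency>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-classic</artifactId>
  <version>1.0.13</version>
</dependency>
```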
>>>
>>> The issue with the current situation is that there's no way for an end user
>>> to choose to *not* use the log4j backend.  (My short-term solution was to
>>> use the Maven shade plugin to swap in a version of the Logging trait with
>>> the body of that method commented out.)  In addition to the situation with
>>> log4j-over-slf4j and the empty enumeration of ROOT appenders, you might
>>> also run afoul of someone who intentionally configured log4j with an empty
>>> set of appenders at the time that Spark is initializing.
>>>
>>> I'd be happy with any implementation that lets me choose my logging
>>> backend: override default behavior via system property, plug-in
>>> architecture, etc.  I do think it's reasonable to expect someone digesting
>>> a substantial JDK-based system like Spark to understand how to initialize
>>> logging -- surely they're using logging of some kind elsewhere in their
>>> application -- but if you want the default behavior there as a courtesy, it
>>> might be worth putting an INFO (versus the glaring log4j WARN) message on
>>> the output that says something like "Initialized default logging via Log4J;
>>> pass -Dspark.logging.loadDefaultLogger=false to disable this behavior." so
>>> that it's both convenient and explicit.
>>>
>>> Cheers.
>>> -- Paul
>>>
>>> --
>>> prb@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
>>>
>>>
>>> On Fri, Feb 7, 2014 at 12:05 AM, Patrick Wendell <pwendell@gmail.com>
>>> wrote:
>>>
>>> > A config option e.g. could just be to add:
>>> >
>>> > spark.logging.loadDefaultLogger (default true)
>>> > If set to true, Spark will try to initialize a log4j logger if none is
>>> > detected. Otherwise Spark will not modify logging behavior.
>>> >
>>> > Then users could just set this to false if they have a logging set-up
>>> > that conflicts with this.
>>> >
>>> > Maybe there is a nicer fix...
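A sketch of how such a gate might read the proposed setting as a system property. Note that `spark.logging.loadDefaultLogger` is the name proposed in this thread, not a shipped Spark configuration key:

```java
public class DefaultLoggingGate {
    // Property name proposed in this thread; hypothetical, not a
    // released Spark configuration key.
    static final String PROP = "spark.logging.loadDefaultLogger";

    // Defaults to true so beginners still get log4j configured for
    // them; users with their own slf4j backend would set it to false.
    static boolean shouldLoadDefaultLogger() {
        return Boolean.parseBoolean(System.getProperty(PROP, "true"));
    }

    public static void main(String[] args) {
        System.out.println(shouldLoadDefaultLogger());
    }
}
```

The `Logging` trait would consult this gate before touching any log4j state, leaving logging untouched when it returns false.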
>>> >
>>> > On Fri, Feb 7, 2014 at 12:03 AM, Patrick Wendell <pwendell@gmail.com>
>>> > wrote:
>>> > > Hey Paul,
>>> > >
>>> > > Thanks for digging this up. I worked on this feature and the intent
>>> > > was to give users good default behavior if they didn't include any
>>> > > logging configuration on the classpath.
>>> > >
>>> > > The problem with assuming that CL tooling is going to fix the job is
>>> > > that many people link against spark as a library and run their
>>> > > application using their own scripts. In this case the first thing
>>> > > people see when they run an application that links against Spark was a
>>> > > big ugly logging warning.
>>> > >
>>> > > I'm not super familiar with log4j-over-slf4j, but this behavior of
>>> > > returning null for the appenders seems a little weird. What is the use
>>> > > case for using this and not just directly use slf4j-log4j12 like Spark
>>> > > itself does?
>>> > >
>>> > > Did you have a more general fix for this in mind? Or was your plan to
>>> > > just revert the existing behavior... We might be able to add a
>>> > > configuration option to disable this logging default stuff. Or we
>>> > > could just rip it out - but I'd like to avoid that if possible.
>>> > >
>>> > > - Patrick
>>> > >
>>> > > On Thu, Feb 6, 2014 at 11:41 PM, Paul Brown <prb@mult.ifario.us> wrote:
>>> > >> We have a few applications that embed Spark, and in 0.8.0 and 0.8.1, we
>>> > >> were able to use slf4j, but 0.9.0 broke that and unintentionally forces
>>> > >> direct use of log4j as the logging backend.
>>> > >>
>>> > >> The issue is here in the org.apache.spark.Logging trait:
>>> > >>
>>> > >> https://github.com/apache/incubator-spark/blame/master/core/src/main/scala/org/apache/spark/Logging.scala#L107
>>> > >>
>>> > >> log4j-over-slf4j *always* returns an empty enumeration for appenders to
>>> > >> the ROOT logger:
>>> > >>
>>> > >>
>>> > >> https://github.com/qos-ch/slf4j/blob/master/log4j-over-slf4j/src/main/java/org/apache/log4j/Category.java?source=c#L81
>>> > >>
>>> > >> And this causes an infinite loop and an eventual stack overflow.
>>> > >>
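To illustrate why the empty enumeration bites: the Logging trait initializes a default appender whenever the ROOT logger reports no appenders, and log4j-over-slf4j reports none forever. A self-contained model of the failure mode and the obvious guard (the names here are hypothetical stand-ins, not Spark's actual code):

```java
import java.util.Collections;
import java.util.Enumeration;

public class DefaultInitModel {
    // Stand-in for log4j-over-slf4j's Category.getAllAppenders(),
    // which always returns an empty enumeration.
    static Enumeration<Object> rootAppenders() {
        return Collections.emptyEnumeration();
    }

    static boolean initialized = false;
    static int initAttempts = 0;

    // Simplified version of Spark's check: set up default logging
    // when the ROOT logger has no appenders. Because rootAppenders()
    // is empty forever, only the `initialized` flag stops this from
    // re-triggering on every log call (and, when initialization
    // itself logs, from recursing until StackOverflowError).
    static void initializeIfNecessary() {
        if (!initialized && !rootAppenders().hasMoreElements()) {
            initialized = true;
            initAttempts++;
            // ... attach a default ConsoleAppender here ...
        }
    }
}
```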
>>> > >> I'm happy to submit a Jira and a patch, but it would be a significant
>>> > >> enough reversal of recent changes that it's probably worth discussing
>>> > >> before I sink a half hour into it.  My suggestion would be that
>>> > >> initialization (or not) should be left to the user, with reasonable
>>> > >> default behavior supplied by the spark command-line tooling and not
>>> > >> forced on applications that incorporate Spark.
>>> > >>
>>> > >> Thoughts/opinions?
>>> > >>
>>> > >> -- Paul
>>> > >> --
>>> > >> prb@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
