spark-dev mailing list archives

From Paul Brown <...@mult.ifario.us>
Subject Re: 0.9.0 forces log4j usage
Date Mon, 10 Feb 2014 07:35:18 GMT
Hi, Patrick --

I gave that a go locally, and it works as desired.

Best.
-- Paul

—
prb@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/


On Fri, Feb 7, 2014 at 6:10 PM, Patrick Wendell <pwendell@gmail.com> wrote:

> Ah okay, sounds good. This is what I meant earlier by "You have
> some other application that directly calls log4j."... i.e., you have,
> for historical reasons, installed log4j-over-slf4j.
>
> Would you mind trying out this fix and seeing if it works? This is
> designed to be a hotfix for 0.9, not a general solution where we rip
> out log4j from our published dependencies:
>
> https://github.com/apache/incubator-spark/pull/560/files
>
> - Patrick
>
> On Fri, Feb 7, 2014 at 5:57 PM, Paul Brown <prb@mult.ifario.us> wrote:
> > Hi, Patrick --
> >
> > I forget which other component is responsible, but we're using the
> > log4j-over-slf4j as part of an overall requirement to centralize logging,
> > i.e., *someone* else is logging over log4j and we're pulling that in.
> >  (There's also some jul logging from Jersey, etc.)
> >
> > Goals:
> >
> > - Fully control/capture all possible logging.  (God forbid we have to grab
> > System.out/err, but we'd do it if needed.)
> > - Use the backend we like best at the moment.  (Happens to be logback.)
> >
> > Possible cases:
> >
> > - If Spark used Log4j at all, we would pull in that logging via
> > log4j-over-slf4j.
> > - If Spark used only slf4j and referenced no backend, we would use it as-is,
> > although we'd still have the log4j-over-slf4j because of other libraries.
> > - If Spark used only slf4j and referenced the slf4j-log4j12 backend, we
> > would exclude that one dependency (via our POM).
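For illustration, the exclusion described in that last case might look roughly like the following sketch in sbt syntax (the thread itself uses a Maven POM, where an <exclusions> element on the spark-core dependency achieves the same thing; the coordinates and version numbers below are placeholders rather than anything taken from the thread):

    // Hedged sketch: keep Spark but swap out its advertised slf4j backend.
    libraryDependencies ++= Seq(
      ("org.apache.spark" %% "spark-core" % "0.9.0-incubating")
        .exclude("org.slf4j", "slf4j-log4j12"),        // drop the log4j binding Spark ships with
      "org.slf4j"      % "log4j-over-slf4j" % "1.7.5", // route other libraries' log4j calls into slf4j
      "ch.qos.logback" % "logback-classic"  % "1.0.13" // the backend we actually want to use
    )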
> >
> > Best.
> > -- Paul
> >
> >
> > --
> > prb@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
> >
> >
> > On Fri, Feb 7, 2014 at 5:38 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> >
> >> Hey Paul,
> >>
> >> So if your goal is ultimately to output to logback, then why don't you
> >> just use slf4j and logback-classic.jar as described here [1]? Why
> >> involve log4j-over-slf4j at all?
> >>
> >> Let's say we refactored the Spark build so it didn't advertise
> >> slf4j-log4j12 as a dependency. Would you still be using
> >> log4j-over-slf4j... or is this just a "fix" to deal with the fact that
> >> Spark is somewhat log4j-dependent at this point?
> >>
> >> [1] http://www.slf4j.org/manual.html
> >>
> >> - Patrick
> >>
> >> On Fri, Feb 7, 2014 at 5:14 PM, Paul Brown <prb@mult.ifario.us> wrote:
> >> > Hi, Patrick --
> >> >
> >> > That's close but not quite it.
> >> >
> >> > The issue that occurs is not the delegation loop mentioned in the slf4j
> >> > documentation.  The stack overflow is entirely within the code in the Spark
> >> > trait:
> >> >
> >> > at org.apache.spark.Logging$class.initializeLogging(Logging.scala:112)
> >> > at org.apache.spark.Logging$class.initializeIfNecessary(Logging.scala:97)
> >> > at org.apache.spark.Logging$class.log(Logging.scala:36)
> >> > at org.apache.spark.SparkEnv$.log(SparkEnv.scala:94)
> >> >
> >> >
> >> > And then that repeats.
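To make the shape of that loop concrete, here is a schematic sketch (not the actual Spark source; the names and structure are assumptions based on the trace above) of how a one-time initializer that itself logs can recurse forever once log4j-over-slf4j reports zero appenders on the root logger:

    // Schematic only: illustrates the recursion in the trace, not Spark's real code.
    import org.apache.log4j.LogManager
    import org.slf4j.{Logger, LoggerFactory}

    trait LoggingSketch {
      private var initialized = false

      protected def log: Logger = {
        initializeIfNecessary()                        // cf. Logging.scala:36 in the trace
        LoggerFactory.getLogger(getClass.getName)
      }

      private def initializeIfNecessary() {
        if (!initialized) initializeLogging()          // cf. Logging.scala:97
      }

      private def initializeLogging() {
        // log4j-over-slf4j always reports zero appenders on the root logger,
        // so this branch is taken on every call...
        if (!LogManager.getRootLogger.getAllAppenders.hasMoreElements) {
          log.info("Using Spark's default log4j profile")  // ...and calling log recurses
        }
        initialized = true                             // never reached; the stack overflows first
      }
    }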
> >> >
> >> > As for our situation, we exclude the slf4j-log4j12 dependency when we
> >> > import the Spark library (because we don't want to use log4j) and have
> >> > log4j-over-slf4j already in place to ensure that all of the logging in the
> >> > overall application runs through slf4j and then out through logback.  (We
> >> > also, as another poster already mentioned, force jcl and jul through
> >> > slf4j.)
> >> >
> >> > The zen of slf4j for libraries is that the library uses the slf4j API and
> >> > then the enclosing application can route logging as it sees fit.  Spark
> >> > master CLI would log via slf4j and include the slf4j-log4j12 backend; same
> >> > for Spark worker CLI.  Spark as a library (versus as a container) would not
> >> > include any backend to the slf4j API and leave this up to the application.
> >> > (FWIW, this would also avoid your log4j warning message.)
> >> >
> >> > But as I was saying before, I'd be happy with a situation where I can avoid
> >> > log4j being enabled or configured, and I think you'll find an existing
> >> > choice of logging framework to be a common scenario for those embedding
> >> > Spark in other systems.
> >> >
> >> > Best.
> >> > -- Paul
> >> >
> >> > --
> >> > prb@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
> >> >
> >> >
> >> > On Fri, Feb 7, 2014 at 3:01 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> >> >
> >> >> Paul,
> >> >>
> >> >> Looking back at your problem, I think it's the one here:
> >> >> http://www.slf4j.org/codes.html#log4jDelegationLoop
> >> >>
> >> >> So let me just be clear what you are doing so I understand. You have
> >> >> some other application that directly calls log4j. So you have to
> >> >> include log4j-over-slf4j to route those logs through slf4j to logback.
> >> >>
> >> >> At the same time you embed Spark in this application. In the past it
> >> >> was fine, but now that Spark programmatically initializes log4j, it
> >> >> screws up your application because log4j-over-slf4j doesn't work with
> >> >> applications that do this explicitly, as discussed here:
> >> >>
> >> >> Correct?
> >> >>
> >> >> - Patrick
> >> >>
> >> >> > On Fri, Feb 7, 2014 at 2:02 PM, Koert Kuipers <koert@tresata.com> wrote:
> >> >> > Got it. That sounds reasonable.
> >> >> >
> >> >> >
> >> >> > On Fri, Feb 7, 2014 at 2:31 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> >> >> >
> >> >> >> Koert - my suggestion was this. We let users use any slf4j backend
> >> >> >> they want. If we detect that they are using the log4j backend and
> >> >> >> *also* they didn't configure any log4j appenders, we set up some nice
> >> >> >> defaults for them. If they are using another backend, Spark doesn't
> >> >> >> try to modify the configuration at all.
> >> >> >>
> >> >> >> On Fri, Feb 7, 2014 at 11:14 AM, Koert Kuipers <koert@tresata.com> wrote:
> >> >> >> > well "static binding" is probably the wrong terminology
but you
> get
> >> >> the
> >> >> >> > idea. multiple backends are not allowed and cause an
even uglier
> >> >> >> warning...
> >> >> >> >
> >> >> >> > see also here:
> >> >> >> > https://github.com/twitter/scalding/pull/636
> >> >> >> > and here:
> >> >> >> >
> https://groups.google.com/forum/#!topic/cascading-user/vYvnnN_15ls
> >> >> >> > all me being annoying and complaining about slf4j-log4j12
> >> dependencies
> >> >> >> > (which did get removed).
> >> >> >> >
> >> >> >> >
> >> >> >> > On Fri, Feb 7, 2014 at 2:09 PM, Koert Kuipers <koert@tresata.com> wrote:
> >> >> >> >
> >> >> >> >> The issue is that slf4j uses static binding. You can put only one slf4j
> >> >> >> >> backend on the classpath, and that's what it uses. More than one is not
> >> >> >> >> allowed.
> >> >> >> >>
> >> >> >> >> So you either keep the slf4j-log4j12 dependency for Spark, and then you
> >> >> >> >> take away people's choice of slf4j backend, which is considered bad form
> >> >> >> >> for a library, or you do not include it and then people will always get
> >> >> >> >> the big fat ugly warning and slf4j logging will not flow to log4j.
> >> >> >> >>
> >> >> >> >> Including log4j itself is not necessarily a problem, I think?
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Fri, Feb 7, 2014 at 1:11 PM, Patrick Wendell <pwendell@gmail.com> wrote:
> >> >> >> >>
> >> >> >> >>> This also seems relevant - but not my area of expertise (whether this
> >> >> >> >>> is a valid way to check this).
> >> >> >> >>>
> >> >> >> >>> http://stackoverflow.com/questions/10505418/how-to-find-which-library-slf4j-has-bound-itself-to
> >> >> >> >>>
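For reference, the check that StackOverflow link describes can be done against slf4j's static binder; a minimal Scala sketch using only the standard slf4j 1.x API (nothing here is Spark code, and the factory class name in the comment is the usual one for the log4j binding, not something stated in the thread):

    import org.slf4j.impl.StaticLoggerBinder

    // Ask slf4j which logger factory it bound to at startup.
    val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
    // Typically "org.slf4j.impl.Log4jLoggerFactory" when slf4j-log4j12 is bound;
    // logback-classic reports a ch.qos.logback... class instead.
    val usingLog4j = binderClass.contains("Log4jLoggerFactory")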
> >> >> >> >>> On Fri, Feb 7, 2014 at 10:08 AM, Patrick Wendell <pwendell@gmail.com> wrote:
> >> >> >> >>> > Hey Guys,
> >> >> >> >>> >
> >> >> >> >>> > Thanks for explaining. Ya, this is a problem - we didn't really know
> >> >> >> >>> > that people are using other slf4j backends; slf4j is in there for
> >> >> >> >>> > historical reasons, but I think we may assume in a few places that
> >> >> >> >>> > log4j is being used, and we should minimize those.
> >> >> >> >>> >
> >> >> >> >>> > We should patch this and get a fix into 0.9.1. So some solutions I see are:
> >> >> >> >>> >
> >> >> >> >>> > (a) Add a SparkConf option to disable this. I'm fine with this one.
> >> >> >> >>> >
> >> >> >> >>> > (b) Ask slf4j which backend is active and only try to enforce this
> >> >> >> >>> > default if we know slf4j is using log4j. Do either of you know if this
> >> >> >> >>> > is possible? Not sure if slf4j exposes this.
> >> >> >> >>> >
> >> >> >> >>> > (c) Just remove this default stuff. We'd rather not do this. The goal
> >> >> >> >>> > of this thing is to provide good usability for people who have linked
> >> >> >> >>> > against Spark and haven't done anything to configure logging. For
> >> >> >> >>> > beginners we try to minimize the assumptions about what else they know
> >> >> >> >>> > about, and I've found log4j configuration is a huge mental barrier for
> >> >> >> >>> > people who are getting started.
> >> >> >> >>> >
> >> >> >> >>> > Paul, if you submit a patch doing (a) we can merge it in. If you have
> >> >> >> >>> > any idea if (b) is possible I prefer that one, but it may not be
> >> >> >> >>> > possible or might be brittle.
> >> >> >> >>> >
> >> >> >> >>> > - Patrick
> >> >> >> >>> >
> >> >> >> >>> > On Fri, Feb 7, 2014 at 6:36 AM, Koert Kuipers <koert@tresata.com> wrote:
> >> >> >> >>> >> Totally agree with Paul: a library should not pick the slf4j backend. It
> >> >> >> >>> >> defeats the purpose of slf4j. That big ugly warning is there to alert
> >> >> >> >>> >> people that it's their responsibility to pick the backend...
> >> >> >> >>> >> On Feb 7, 2014 3:55 AM, "Paul Brown" <prb@mult.ifario.us> wrote:
> >> >> >> >>> >>
> >> >> >> >>> >>> Hi, Patrick --
> >> >> >> >>> >>>
> >> >> >> >>> >>> From slf4j, you can either use log4j as the backend (which is the way that
> >> >> >> >>> >>> Spark is shipped) or you can route log4j through slf4j and then on to a
> >> >> >> >>> >>> different backend (e.g., logback).  We're doing the latter and manipulating
> >> >> >> >>> >>> the dependencies in the build because that's the way the enclosing
> >> >> >> >>> >>> application is set up.
> >> >> >> >>> >>>
> >> >> >> >>> >>> The issue with the current situation is that there's no way for an end user
> >> >> >> >>> >>> to choose to *not* use the log4j backend.  (My short-term solution was to
> >> >> >> >>> >>> use the Maven shade plugin to swap in a version of the Logging trait with
> >> >> >> >>> >>> the body of that method commented out.)  In addition to the situation with
> >> >> >> >>> >>> log4j-over-slf4j and the empty enumeration of ROOT appenders, you might
> >> >> >> >>> >>> also run afoul of someone who intentionally configured log4j with an empty
> >> >> >> >>> >>> set of appenders at the time that Spark is initializing.
> >> >> >> >>> >>>
> >> >> >> >>> >>> I'd be happy with any implementation that lets me choose my logging
> >> >> >> >>> >>> backend: override default behavior via system property, plug-in
> >> >> >> >>> >>> architecture, etc.  I do think it's reasonable to expect someone digesting
> >> >> >> >>> >>> a substantial JDK-based system like Spark to understand how to initialize
> >> >> >> >>> >>> logging -- surely they're using logging of some kind elsewhere in their
> >> >> >> >>> >>> application -- but if you want the default behavior there as a courtesy, it
> >> >> >> >>> >>> might be worth putting an INFO (versus the glaring log4j WARN) message on
> >> >> >> >>> >>> the output that says something like "Initialized default logging via Log4J;
> >> >> >> >>> >>> pass -Dspark.logging.loadDefaultLogger=false to disable this behavior." so
> >> >> >> >>> >>> that it's both convenient and explicit.
> >> >> >> >>> >>>
> >> >> >> >>> >>> Cheers.
> >> >> >> >>> >>> -- Paul
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>> --
> >> >> >> >>> >>> prb@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
> >> >> >> >>> >>>
> >> >> >> >>> >>>
> >> >> >> >>> >>> On Fri, Feb 7, 2014 at 12:05 AM, Patrick Wendell <pwendell@gmail.com> wrote:
> >> >> >> >>> >>>
> >> >> >> >>> >>> > A config option e.g. could just be to add:
> >> >> >> >>> >>> >
> >> >> >> >>> >>> > spark.logging.loadDefaultLogger (default true)
> >> >> >> >>> >>> > If set to true, Spark will try to initialize a log4j logger if none is
> >> >> >> >>> >>> > detected. Otherwise Spark will not modify logging behavior.
> >> >> >> >>> >>> >
> >> >> >> >>> >>> > Then users could just set this to false if they have a logging set-up
> >> >> >> >>> >>> > that conflicts with this.
> >> >> >> >>> >>> >
> >> >> >> >>> >>> > Maybe there is a nicer fix...
> >> >> >> >>> >>> >
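To make the proposal concrete, here is a rough sketch of what such a guard might look like around the default initialization; the property name is only the one floated in this thread, and the backend and appender checks are assumptions rather than an actual patch:

    // Hypothetical guard around the default log4j set-up (not an actual Spark patch).
    import org.apache.log4j.{ConsoleAppender, Level, LogManager, PatternLayout}
    import org.slf4j.impl.StaticLoggerBinder

    def maybeInitializeLog4jDefaults() {
      val loadDefaults =
        System.getProperty("spark.logging.loadDefaultLogger", "true").toBoolean
      // Only touch log4j when slf4j is really bound to slf4j-log4j12; with
      // log4j-over-slf4j or logback on the classpath this check comes back false.
      val usingLog4j = StaticLoggerBinder.getSingleton
        .getLoggerFactoryClassStr.contains("Log4jLoggerFactory")
      val root = LogManager.getRootLogger
      if (loadDefaults && usingLog4j && !root.getAllAppenders.hasMoreElements) {
        root.setLevel(Level.INFO)
        root.addAppender(new ConsoleAppender(
          new PatternLayout("%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n")))
      }
    }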
> >> >> >> >>> >>> > On Fri, Feb 7, 2014 at 12:03 AM, Patrick Wendell <pwendell@gmail.com> wrote:
> >> >> >> >>> >>> > > Hey Paul,
> >> >> >> >>> >>> > >
> >> >> >> >>> >>> > > Thanks for digging this up. I worked on this feature and the intent
> >> >> >> >>> >>> > > was to give users good default behavior if they didn't include any
> >> >> >> >>> >>> > > logging configuration on the classpath.
> >> >> >> >>> >>> > >
> >> >> >> >>> >>> > > The problem with assuming that CL tooling is going to fix the job is
> >> >> >> >>> >>> > > that many people link against Spark as a library and run their
> >> >> >> >>> >>> > > application using their own scripts. In this case the first thing
> >> >> >> >>> >>> > > people see when they run an application that links against Spark is a
> >> >> >> >>> >>> > > big ugly logging warning.
> >> >> >> >>> >>> > >
> >> >> >> >>> >>> > > I'm not super familiar with log4j-over-slf4j, but this behavior of
> >> >> >> >>> >>> > > returning null for the appenders seems a little weird. What is the use
> >> >> >> >>> >>> > > case for using this and not just directly using slf4j-log4j12 like
> >> >> >> >>> >>> > > Spark itself does?
> >> >> >> >>> >>> > >
> >> >> >> >>> >>> > > Did you have a more general fix for this in mind? Or was your plan to
> >> >> >> >>> >>> > > just revert the existing behavior... We might be able to add a
> >> >> >> >>> >>> > > configuration option to disable this logging default stuff. Or we
> >> >> >> >>> >>> > > could just rip it out - but I'd like to avoid that if possible.
> >> >> >> >>> >>> > >
> >> >> >> >>> >>> > > - Patrick
> >> >> >> >>> >>> > >
> >> >> >> >>> >>> > > On Thu, Feb 6, 2014 at 11:41 PM, Paul Brown <prb@mult.ifario.us> wrote:
> >> >> >> >>> >>> > >> We have a few applications that embed Spark, and in 0.8.0 and 0.8.1, we
> >> >> >> >>> >>> > >> were able to use slf4j, but 0.9.0 broke that and unintentionally forces
> >> >> >> >>> >>> > >> direct use of log4j as the logging backend.
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> The issue is here in the org.apache.spark.Logging trait:
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> https://github.com/apache/incubator-spark/blame/master/core/src/main/scala/org/apache/spark/Logging.scala#L107
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> log4j-over-slf4j *always* returns an empty enumeration for appenders to
> >> >> >> >>> >>> > >> the ROOT logger:
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> https://github.com/qos-ch/slf4j/blob/master/log4j-over-slf4j/src/main/java/org/apache/log4j/Category.java?source=c#L81
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> And this causes an infinite loop and an eventual stack overflow.
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> I'm happy to submit a Jira and a patch, but it would be a significant
> >> >> >> >>> >>> > >> enough reversal of recent changes that it's probably worth discussing
> >> >> >> >>> >>> > >> before I sink a half hour into it.  My suggestion would be that
> >> >> >> >>> >>> > >> initialization (or not) should be left to the user, with reasonable
> >> >> >> >>> >>> > >> default behavior supplied by the Spark command-line tooling and not
> >> >> >> >>> >>> > >> forced on applications that incorporate Spark.
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> Thoughts/opinions?
> >> >> >> >>> >>> > >>
> >> >> >> >>> >>> > >> -- Paul
> >> >> >> >>> >>> > >> --
> >> >> >> >>> >>> > >> prb@mult.ifario.us | Multifarious, Inc. | http://mult.ifario.us/
