spark-dev mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: [VOTE] Release Apache Spark 1.0.0 (rc5)
Date Mon, 02 Jun 2014 17:50:26 GMT
On Mon, Jun 2, 2014 at 6:05 PM, Marcelo Vanzin <vanzin@cloudera.com> wrote:
> You mentioned something in your shading argument that kinda reminded
> me of something. Spark currently depends on slf4j implementations and
> log4j with "compile" scope. I'd argue that's the wrong approach if
> we're talking about Spark being used embedded inside applications;
> Spark should only depend on the slf4j API package, and let the
> application provide the underlying implementation.

Good idea in general; in practice, the drawback is that you can't do
things like set log levels if you only depend on the SLF4J API. There
are a few cases where that's nice to control, and that's only possible
if you bind to a particular logger as well.
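
To make that concrete, here is a minimal sketch of the shape of the problem. It is not Spark's code: LogFacade is a hypothetical stand-in for a facade like the SLF4J API, and java.util.logging stands in for a concrete backend such as log4j so the example runs on a bare JDK. The class and method names (LogLevelDemo, JulLog, quiet) are made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogLevelDemo {

    // Hypothetical facade: callers can emit messages and ask whether a
    // level is enabled, but there is no method to change the level.
    interface LogFacade {
        void info(String msg);
        boolean isInfoEnabled();
    }

    // Adapter from the facade onto a concrete backend
    // (java.util.logging here, standing in for log4j).
    static final class JulLog implements LogFacade {
        private final Logger backend;

        JulLog(String name) {
            this.backend = Logger.getLogger(name);
        }

        @Override
        public void info(String msg) {
            backend.info(msg);
        }

        @Override
        public boolean isInfoEnabled() {
            return backend.isLoggable(Level.INFO);
        }
    }

    // Changing the level means naming the backend's own classes, so any
    // code that does this is bound to that backend at compile time.
    static void quiet(String name) {
        Logger.getLogger(name).setLevel(Level.WARNING);
    }

    public static void main(String[] args) {
        LogFacade log = new JulLog("demo.component");
        System.out.println("INFO enabled before: " + log.isInfoEnabled());
        quiet("demo.component");
        System.out.println("INFO enabled after:  " + log.isInfoEnabled());
    }
}
```

Code that only sees LogFacade has no way to write quiet(); that call needs the backend on the compile classpath, which is the trade-off described above.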

You typically bundle an SLF4J binding anyway, to give a default, or
else the end-user has to know to also bind some SLF4J logger to get
output. Of course it does make for a bit more surgery if you want to
override the binding this way.

Shading can bring a whole new level of confusion; I myself would only
use it where essential as a workaround. Same with trying to make more
elaborate custom classloading schemes -- never in my darkest
nightmares have I imagined the failure modes that probably pop up when
that goes wrong. I think the library collisions will get better over
time as only later versions of Hadoop are in scope, for example,
and/or one build system is in play. I like tackling complexity along
those lines first.
