spark-user mailing list archives

From Josh Rosen <rosenvi...@gmail.com>
Subject Re: Recent Git Builds Application WebUI Problem and Exception Stating "Log directory /tmp/spark-events does not exist."
Date Mon, 19 Jan 2015 01:21:08 GMT
This looks like a bug in the master branch of Spark, related to some recent
changes to EventLoggingListener.  You can reproduce this bug on a fresh
Spark checkout by running

./bin/spark-shell --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=/tmp/nonexistent-dir

where /tmp/nonexistent-dir does not exist but its parent directory, /tmp,
does.

It looks like older versions of EventLoggingListener would create the
directory if it didn't exist.  I think the issue here is that the
error-checking code is overzealous and catches some non-error conditions,
too; I've filed https://issues.apache.org/jira/browse/SPARK-5311 to
investigate this.
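
In the meantime, a possible workaround is to create the event log directory
before launching, since the report below notes that creating it by hand makes
the exception go away (though not the WebUI problem). This is only a minimal
sketch: ensure_event_log_dir is a hypothetical helper name, not part of Spark,
and it assumes the directory lives on the local filesystem of the machine that
starts the application.

```shell
#!/bin/sh
# Workaround sketch: make sure the event log directory exists before Spark
# starts. ensure_event_log_dir is a hypothetical helper, not a Spark command.
ensure_event_log_dir() {
    dir="$1"
    # A path that exists but is not a directory is a genuine error.
    if [ -e "$dir" ] && [ ! -d "$dir" ]; then
        echo "error: $dir exists but is not a directory" >&2
        return 1
    fi
    # A missing directory is not an error: create it (and any missing parents).
    mkdir -p "$dir"
}

# Example usage, left commented out so the snippet runs without a Spark checkout:
# ensure_event_log_dir /tmp/spark-events
# ./bin/spark-shell --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=/tmp/spark-events
```

The helper separates the two cases described above: a missing directory is
simply created, and only a path that exists as something other than a
directory is reported as an error.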

On Sun, Jan 18, 2015 at 1:59 PM, Ganon Pierce <ganon.pierce@me.com> wrote:

> I posted about the Application WebUI error (specifically application WebUI
> not the master WebUI generally) and have spent at least a few hours a day
> for over a week trying to resolve it so I’d be very grateful for any
> suggestions. It is quite troubling that I appear to be the only one
> encountering this issue and I’ve tried to include everything here which
> might be relevant (sorry for the length). Please see the thread "Current
> Build Gives HTTP ERROR"
> https://www.mail-archive.com/user@spark.apache.org/msg18752.html to see
> specifics about the application webUI issue and the master log.
>
>
> Environment:
>
> I’m doing my spark builds and application programming in scala locally on
> my macbook pro in eclipse, using modified ec2 launch scripts to launch my
> cluster, uploading my spark builds and models to s3, and uploading
> applications to and submitting them from ec2. I’m using java 8 locally and
> also installing and using java 8 on my ec2 instances (which works with
> spark 1.2.0). I have a windows machine at home (macbook is work machine),
> but have not yet attempted to launch from there.
>
>
> Errors:
>
> I’ve built two different recent git versions of spark both multiple times,
> and when running applications both have produced an Application WebUI error
> and this exception:
>
> Exception in thread "main" java.lang.IllegalArgumentException: Log
> directory /tmp/spark-events does not exist.
>
> While both will display the master webUI just fine including
> running/completed applications, registered workers etc, when I try to
> access a running or completed application’s WebUI by clicking their
> respective link, I receive a server error. When I manually create the above
> log directory, the exception goes away, but the WebUI problem does not. I
> don’t have any strong evidence, but I suspect these errors and whatever is
> causing them are related.
>
>
> Why and How of Modifications to Launch Scripts for Installation of
> Unreleased Spark Versions:
>
> When using a prebuilt version of spark on my cluster everything works
> except the new methods I need, which I had previously added to my custom
> version of spark and used by building the spark-assembly.jar locally and
> then replacing the assembly file produced through the 1.1.0 ec2 launch
> scripts. However, since my pull request was accepted and can now be found
> in the apache/spark repository along with some additional features I’d like
> to use and because I’d like a more elegant permanent solution for launching
> a cluster and installing unreleased versions of spark to my ec2 clusters,
> I’ve modified the included ec2 launch scripts in this way (credit to gen
> tang here:
> https://www.mail-archive.com/user@spark.apache.org/msg18761.html):
>
> 1. Clone the most recent git version of spark
> 2. Use the make-dist script
> 3. Tar the dist folder and upload the resulting
> spark-1.3.0-snapshot-hadoop1.tgz to s3 and change file permissions
> 4. Fork the mesos/spark-ec2 repository and modify the spark/init.sh script
> to do a wget of my hosted distribution instead of spark’s stable release
> 5. Modify my spark_ec2.py script to point to my repository.
> 6. Modify my spark_ec2.py script to install java 8 on my ec2 instances.
> (This works and does not produce the above stated errors when using a
> stable release like 1.2.0).
>
>
> Additional Possibly Related Info:
>
> As far as I can tell (I went through line by line), when I launch my
> recent build vs when I launch the most recent stable release the console
> prints almost identical INFO and WARNINGS except where you would expect
> things to be different e.g. version numbers. I’ve noted that after launch
> the prebuilt stable version does not have a /tmp/spark-events directory,
> but it is created when the application is launched, while it is never
> created in my build. Further, in my unreleased builds the application logs
> that I find are always stored as .inprogress files (when I set the logging
> directory to /root/ or add the /tmp/spark-events directory manually) even
> after completion; I believe the suffix is supposed to change to .completed (or
> something similar) when the application finishes.
>
>
> Thanks for any help!
>
>
