spark-user mailing list archives

From Ganon Pierce <ganon.pie...@me.com>
Subject Recent Git Builds Application WebUI Problem and Exception Stating "Log directory /tmp/spark-events does not exist."
Date Sun, 18 Jan 2015 21:59:40 GMT
I posted earlier about the Application WebUI error (specifically the application WebUI, not
the master WebUI generally) and have spent at least a few hours a day for over a week trying
to resolve it, so I’d be very grateful for any suggestions. It is quite troubling that I appear
to be the only one encountering this issue, and I’ve tried to include everything here that might
be relevant (sorry for the length). Please see the thread "Current Build Gives HTTP ERROR"
(https://www.mail-archive.com/user@spark.apache.org/msg18752.html) for specifics about the
application WebUI issue and the master log.


Environment:

I’m doing my spark builds and application programming in scala locally on my macbook pro
in eclipse, using modified ec2 launch scripts to launch my cluster, uploading my spark builds
and models to s3, and uploading applications to and submitting them from ec2. I’m using
java 8 locally and also installing and using java 8 on my ec2 instances (which works with
spark 1.2.0). I have a windows machine at home (macbook is work machine), but have not yet
attempted to launch from there.


Errors:

I’ve built two different recent git versions of spark both multiple times, and when running
applications both have produced an Application WebUI error and this exception: 

Exception in thread "main" java.lang.IllegalArgumentException: Log directory /tmp/spark-events does not exist.

While both will display the master webUI just fine including running/completed applications,
registered workers etc, when I try to access a running or completed application’s WebUI
by clicking their respective link, I receive a server error. When I manually create the above
log directory, the exception goes away, but the WebUI problem does not. I don’t have any
strong evidence, but I suspect these errors and whatever is causing them are related. 
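
For reference, the manual workaround described above (creating the log directory before the application starts) can be sketched in Python. This is only a minimal illustration: the helper name ensure_event_log_dir is hypothetical, and the default path is simply the one named in the exception. As noted above, it addresses the exception only, not the WebUI problem.

```python
import os


def ensure_event_log_dir(path="/tmp/spark-events"):
    """Create the event log directory if it is missing and return its path.

    Hypothetical helper: the default path is the one named in the exception.
    """
    # exist_ok=True makes the call a no-op when the directory already exists.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling ensure_event_log_dir() on the node that submits the application, before submission, would mirror the manual step.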


Why and How of Modifications to Launch Scripts for Installation of Unreleased Spark Versions:

When using a prebuilt version of spark on my cluster everything works except the new methods
I need, which I had previously added to my custom version of spark and used by building the
spark-assembly.jar locally and then replacing the assembly file produced through the 1.1.0
ec2 launch scripts. However, since my pull request was accepted and can now be found in the
apache/spark repository along with some additional features I’d like to use and because
I’d like a more elegant permanent solution for launching a cluster and installing unreleased
versions of spark to my ec2 clusters, I’ve modified the included ec2 launch scripts in this
way (credit to gen tang here: https://www.mail-archive.com/user%40spark.apache.org/msg18761.html):

1. Clone the most recent git version of spark
2. Use the make-dist script 
3. Tar the dist folder and upload the resulting spark-1.3.0-snapshot-hadoop1.tgz to s3 and
change file permissions
4. Fork the mesos/spark-ec2 repository and modify the spark/init.sh script to do a wget of
my hosted distribution instead of spark’s stable release
5. Modify my spark_ec2.py script to point to my repository.
6. Modify my spark_ec2.py script to install java 8 on my ec2 instances. (This works and does
not produce the above stated errors when using a stable release like 1.2.0).
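
Step 3 above (packing the dist folder into a .tgz before upload) could look roughly like the following Python sketch. The function name make_tgz and the top_level_name parameter are assumptions for illustration; the name of the top-level directory inside the archive has to match whatever the modified init.sh expects after unpacking, which is not stated here. The upload to s3 and the permission change are not shown.

```python
import tarfile


def make_tgz(dist_dir, tgz_path, top_level_name):
    """Pack dist_dir into a gzip-compressed tarball at tgz_path.

    top_level_name is the directory name the files get inside the archive
    (an assumption; pick whatever the unpacking script expects).
    """
    with tarfile.open(tgz_path, "w:gz") as tar:
        # arcname renames the root of the tree inside the archive.
        tar.add(dist_dir, arcname=top_level_name)
    return tgz_path
```

For example, make_tgz("dist", "spark-1.3.0-snapshot-hadoop1.tgz", "spark") would produce an archive whose contents sit under a top-level "spark" directory.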


Additional Possibly Related Info:

As far as I can tell (I went through line by line), when I launch my recent build versus the
most recent stable release, the console prints almost identical INFO and WARNING messages
except where you would expect differences, e.g. version numbers. I’ve noted that after launch
the prebuilt stable version does not have a /tmp/spark-events directory, but it is created
when the application is launched, while it is never created in my build. Further, in my
unreleased builds the application logs that I find are always stored as .inprogress files
(when I set the logging directory to /root/ or add the /tmp/spark-events directory manually),
even after completion; I believe the suffix is supposed to change to .completed (or something
similar) when the application finishes.
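
A quick way to check which logs are still marked as in progress is to scan the log directory for that suffix. The sketch below is an illustration only: the function name split_event_logs is hypothetical, and the ".inprogress" suffix is taken from the observation above rather than from any Spark documentation.

```python
import os


def split_event_logs(log_dir, suffix=".inprogress"):
    """Return (in_progress, other) lists of entry names found in log_dir.

    The suffix default comes from the file names observed above.
    """
    in_progress, other = [], []
    for name in sorted(os.listdir(log_dir)):
        if name.endswith(suffix):
            in_progress.append(name)
        else:
            other.append(name)
    return in_progress, other
```

Running split_event_logs("/tmp/spark-events") after an application has finished would show whether its log still carries the in-progress suffix.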


Thanks for any help!

