spark-user mailing list archives

From Jeremy Lee <>
Subject Yay for 1.0.0! EC2 Still has problems.
Date Fri, 30 May 2014 12:08:15 GMT
Hi there! I'm relatively new to the list, so sorry if this is a repeat:

I just wanted to mention there are still problems with the EC2 scripts.
Basically, they don't work.

First, if you run the scripts on Amazon's own suggested version of Linux,
they break, because Amazon installs Python 2.6.9 and the scripts use a
couple of Python 2.7 commands. I have to "sudo yum install python27", and
then edit the spark-ec2 shell script to use that specific version.
Annoying, but minor.

(The base "python" command isn't upgraded to 2.7 on many systems,
apparently because it would break yum.)
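For the record, the workaround above can be scripted. This is only a sketch
under my assumptions: the "python27" package name is Amazon Linux's, and the
spark-ec2 wrapper's final line invoking spark_ec2.py matches the 1.0.0
release layout; check your copy before patching. Demonstrated here on a
throwaway copy so nothing real is touched:

```shell
#!/bin/sh
# On the real machine you would first run:  sudo yum install -y python27
# Then point the spark-ec2 wrapper at the 2.7 binary explicitly.
tmp=$(mktemp -d)
# Stand-in for the wrapper's last line (assumed from the 1.0.0 layout):
printf 'python "$SPARK_EC2_DIR/spark_ec2.py" "$@"\n' > "$tmp/spark-ec2"
# Swap the bare "python" for "python2.7" in place:
sed -i 's/^python /python2.7 /' "$tmp/spark-ec2"
result=$(cat "$tmp/spark-ec2")
echo "$result"
rm -rf "$tmp"
```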

The second minor problem is that the script doesn't know about the
"r3.large" servers... also easily fixed by adding them to the script.
Minor.
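For anyone hitting the same thing: spark_ec2.py keys its AMI choice off a
dict of known instance types, so the r3 family can be added there. This is a
hedged sketch, not the script's actual contents; the dict name and the
existing entries shown are assumptions standing in for the 1.0.0 script's
real table, and r3 instances are HVM-only, so they are mapped as "hvm":

```python
# Stand-in for the instance-type -> virtualization-type table in
# spark_ec2.py (names and sample entries are assumed, not copied):
instance_types = {
    "m1.small": "pvm",
    "m3.large": "hvm",
}

# r3 instances are HVM-only, so register the whole family as "hvm":
for r3 in ("r3.large", "r3.xlarge", "r3.2xlarge", "r3.4xlarge", "r3.8xlarge"):
    instance_types[r3] = "hvm"

print(instance_types["r3.large"])  # -> hvm
```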

The big problem is that after the EC2 cluster is provisioned, installed,
set up, and everything, it fails to start up the webserver on the master.
Here's the tail of the log:

Starting GANGLIA gmond:                                    [  OK  ]
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
Connection to closed.
Shutting down GANGLIA gmond:                               [FAILED]
Starting GANGLIA gmond:                                    [  OK  ]
Connection to closed.
Shutting down GANGLIA gmetad:                              [FAILED]
Starting GANGLIA gmetad:                                   [  OK  ]
Stopping httpd:                                            [FAILED]
Starting httpd: httpd: Syntax error on line 153 of
/etc/httpd/conf/httpd.conf: Cannot load modules/ into
server: /etc/httpd/modules/ cannot open shared object
file: No such file or directory

Basically, the AMI you have chosen does not seem to have a "full" install
of Apache, and is missing several modules that are referred to in the
httpd.conf file that is installed. The full list of missing modules is:

authn_alias_module modules/
authn_default_module modules/
authz_default_module modules/
ldap_module modules/
authnz_ldap_module modules/
disk_cache_module modules/

Alas, even if these modules are commented out, the server still fails to
start:

[root@ip-172-31-11-193 ~]$ service httpd start
Starting httpd: AH00534: httpd: Configuration error: No MPM loaded.
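In case it helps anyone else, the bulk commenting-out I tried can be done
with a sed loop. This is a sketch on a throwaway two-line conf file, since I
can't paste the AMI's real httpd.conf; on the master you would point CONF at
/etc/httpd/conf/httpd.conf instead (and, as the log shows, the "No MPM
loaded" error remains afterwards, so this alone does not fix the server):

```shell
#!/bin/sh
# Throwaway stand-in for the broken conf (module filenames here are
# illustrative, not copied from the AMI):
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
LoadModule authn_alias_module modules/mod_authn_alias.so
LoadModule ldap_module modules/mod_ldap.so
EOF
# Comment out every LoadModule line for a module known to be missing:
for mod in authn_alias_module authn_default_module authz_default_module \
           ldap_module authnz_ldap_module disk_cache_module; do
    sed -i "s|^LoadModule ${mod} |#&|" "$CONF"
done
grep -c '^#LoadModule' "$CONF"
```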

That means Spark 1.0.0 clusters on EC2 are Dead-On-Arrival when run
according to the instructions. Sorry.

Any suggestions on how to proceed? I'll keep trying to fix the webserver,
but (a) changes to httpd.conf get blown away by "resume", and (b) anything
I do has to be redone every time I provision another cluster. Ugh.

Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers
