spark-user mailing list archives

From Nicholas Chammas <nicholas.cham...@gmail.com>
Subject Re: spark 1.2 ec2 launch script hang
Date Thu, 29 Jan 2015 02:45:19 GMT
If that was indeed the problem, I suggest updating your answer on SO
<http://stackoverflow.com/a/28005151/877069> to help others who may run
into this same problem.

On Wed Jan 28 2015 at 9:40:39 PM Nicholas Chammas <
nicholas.chammas@gmail.com> wrote:

> Thanks for sending this over, Peter.
>
> What if you try this? (i.e. Remove the = after --identity-file.)
>
> ec2/spark-ec2 --key-pair=spark-streaming-kp --identity-file ~/.pzkeys/spark-streaming-kp.pem --region=us-east-1 login pz-spark-cluster
>
> If that works, then I think the problem in this case is simply that Bash
> cannot expand the tilde because it’s stuck to the --identity-file=. This
> isn’t a problem with spark-ec2.
>
> Bash sees the --identity-file=~/.pzkeys/spark-streaming-kp.pem as one big
> argument, so it can’t do tilde expansion.
>
> Nick
>
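Nick's word-splitting point can be checked directly by asking bash to expand each form. A minimal sketch (the key path is hypothetical):

```python
import subprocess

# Ask bash to expand a single word, the way it would when building
# spark-ec2's argument list. '~' expands only at the start of a word, so
# when it is glued to '--identity-file=' it is passed through literally.
def bash_expand(word):
    return subprocess.check_output(
        ["bash", "-c", "printf '%s' " + word]).decode()

print(bash_expand("~/key.pem"))                  # expands to $HOME/key.pem
print(bash_expand("--identity-file=~/key.pem"))  # '~' left unexpanded
```

Passing `--identity-file` and the path as two separate words gives the shell a word-initial `~` to expand, which is why dropping the `=` fixes it.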
> On Wed Jan 28 2015 at 9:17:06 PM Peter Zybrick <pzybrick@gmail.com> wrote:
>
>> Below is the trace from trying to access with ~/path. I also did the echo as
>> per Nick (see the last line); it looks ok to me. This is my development box
>> with Spark 1.2.0 running CentOS 6.5, Python 2.6.6.
>>
>> [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ ec2/spark-ec2
>> --key-pair=spark-streaming-kp --identity-file=~/.pzkeys/spark-streaming-kp.pem
>> --region=us-east-1 login pz-spark-cluster
>> Searching for existing cluster pz-spark-cluster...
>> Found 1 master(s), 3 slaves
>> Logging into master ec2-54-152-95-129.compute-1.amazonaws.com...
>> Warning: Identity file ~/.pzkeys/spark-streaming-kp.pem not accessible:
>> No such file or directory.
>> Permission denied (publickey).
>> Traceback (most recent call last):
>>   File "ec2/spark_ec2.py", line 1082, in <module>
>>     main()
>>   File "ec2/spark_ec2.py", line 1074, in main
>>     real_main()
>>   File "ec2/spark_ec2.py", line 1007, in real_main
>>     ssh_command(opts) + proxy_opt + ['-t', '-t', "%s@%s" % (opts.user,
>> master)])
>>   File "/usr/lib64/python2.6/subprocess.py", line 505, in check_call
>>     raise CalledProcessError(retcode, cmd)
>> subprocess.CalledProcessError: Command '['ssh', '-o',
>> 'StrictHostKeyChecking=no', '-i', '~/.pzkeys/spark-streaming-kp.pem',
>> '-t', '-t', u'root@ec2-54-152-95-129.compute-1.amazonaws.com']' returned
>> non-zero exit status 255
>> [pete.zybrick@pz-lt2-ipc spark-1.2.0]$ echo ~/.pzkeys/spark-streaming-kp.pem
>> /home/pete.zybrick/.pzkeys/spark-streaming-kp.pem
>>
>>
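The traceback above shows ssh receiving the literal `~/.pzkeys/...` string and exiting with status 255. A fail-fast guard would catch this before ssh is ever invoked; a hypothetical sketch (not the shipped spark_ec2.py code):

```python
import os

def resolve_identity_file(path):
    """Hypothetical guard (not the actual spark_ec2.py code): expand '~'
    and check that the key file exists before handing it to ssh, so a bad
    path fails immediately instead of surfacing as ssh exit status 255."""
    expanded = os.path.expanduser(path)
    if not os.path.isfile(expanded):
        raise SystemExit("ERROR: identity file %s not accessible" % expanded)
    return expanded
```

Calling something like this once, before the ssh command is assembled, would turn the confusing `CalledProcessError` into a clear one-line error.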
>> On Wed, Jan 28, 2015 at 3:49 PM, Charles Feduke <charles.feduke@gmail.com
>> > wrote:
>>
>>> Yeah, I agree ~ should work. It could have been [read: probably was] the
>>> fact that one of the EC2 hosts was already in my known_hosts (I can't be
>>> sure, since that state produces no error message), which I later fixed with
>>> Pete's patch. The second run, with an absolute path, may have worked simply
>>> because the random hosts that came up on EC2 were never in my known_hosts.
>>>
>>>
>>> On Wed Jan 28 2015 at 3:45:36 PM Nicholas Chammas <
>>> nicholas.chammas@gmail.com> wrote:
>>>
>>>> Hmm, I can’t see why using ~ would be problematic, especially if you
>>>> confirm that echo ~/path/to/pem expands to the correct path to your
>>>> identity file.
>>>>
>>>> If you have a simple reproduction of the problem, please send it over.
>>>> I’d love to look into this. When I pass paths with ~ to spark-ec2 on my
>>>> system, it works fine. I’m using bash, but zsh handles tilde expansion the
>>>> same as bash.
>>>>
>>>> Nick
>>>>
>>>> On Wed Jan 28 2015 at 3:30:08 PM Charles Feduke <
>>>> charles.feduke@gmail.com> wrote:
>>>>
>>>>> It was only hanging when I specified the path with ~; I never tried a
>>>>> relative path.
>>>>>
>>>>> It hung while waiting for ssh to be ready on all hosts. I let it sit
>>>>> for about 10 minutes, then found the StackOverflow answer that suggested
>>>>> specifying an absolute path, cancelled, and re-ran with --resume and the
>>>>> absolute path, and all slaves were up in a couple of minutes.
>>>>>
>>>>> (I've stood up 4 integration clusters and 2 production clusters on EC2
>>>>> since with no problems.)
>>>>>
>>>>> On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas <
>>>>> nicholas.chammas@gmail.com> wrote:
>>>>>
>>>>>> Ey-chih,
>>>>>>
>>>>>> That makes more sense. This is a known issue that will be fixed as
>>>>>> part of SPARK-5242 <https://issues.apache.org/jira/browse/SPARK-5242>
>>>>>> .
>>>>>>
>>>>>> Charles,
>>>>>>
>>>>>> Thanks for the info. In your case, when does spark-ec2 hang? Only
>>>>>> when the specified path to the identity file doesn't exist? Or also when
>>>>>> you specify the path as a relative path or with ~?
>>>>>>
>>>>>> Nick
>>>>>>
>>>>>>
>>>>>> On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow <eychih@hotmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> We found the problem and already fixed it. Basically, spark-ec2
>>>>>>> requires ec2 instances to have external ip addresses. You need to
>>>>>>> specify this in the AWS console.
>>>>>>> ------------------------------
>>>>>>> From: nicholas.chammas@gmail.com
>>>>>>> Date: Tue, 27 Jan 2015 17:19:21 +0000
>>>>>>> Subject: Re: spark 1.2 ec2 launch script hang
>>>>>>> To: charles.feduke@gmail.com; pzybrick@gmail.com; eychih@hotmail.com
>>>>>>> CC: user@spark.apache.org
>>>>>>>
>>>>>>>
>>>>>>> For those who found that absolute vs. relative path for the pem file
>>>>>>> mattered, what OS and shell are you using? What version of Spark are you
>>>>>>> using?
>>>>>>>
>>>>>>> ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/
>>>>>>> to the absolute path before sending it to spark-ec2 (i.e. tilde
>>>>>>> expansion).
>>>>>>>
>>>>>>> Absolute vs. relative path (e.g. ../../path/to/pem) also shouldn’t
>>>>>>> matter, since we fixed that for Spark 1.2.0
>>>>>>> <https://issues.apache.org/jira/browse/SPARK-4137>. Maybe there’s
>>>>>>> some case that we missed?
>>>>>>>
>>>>>>> Nick
>>>>>>>
>>>>>>> On Tue Jan 27 2015 at 10:10:29 AM Charles Feduke <
>>>>>>> charles.feduke@gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Absolute path means no ~. Also verify that you have the path to the
>>>>>>> file correct: for some reason the Python code does not validate that the
>>>>>>> file exists and will hang (this is the same reason why ~ hangs).
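The non-validating behavior Charles describes can be sketched as a retry loop (a hypothetical simplification, not the real spark-ec2 code) in which a permanent ssh failure is indistinguishable from an instance that isn't ready yet:

```python
import time

def wait_for_ssh_ready(try_ssh, max_tries=100, delay=0.01):
    """Hypothetical simplification of a wait-for-ssh loop: every nonzero
    exit status (including 255 from a bad identity file) is treated as
    'not ready yet' and retried, so a missing key file looks like a hang."""
    for attempt in range(max_tries):
        if try_ssh() == 0:      # ssh succeeded; cluster is reachable
            return attempt
        time.sleep(delay)       # permanent errors are retried like timeouts
    raise RuntimeError("gave up waiting for ssh")
```

Since the loop never distinguishes "connection refused while booting" from "key file not accessible", validating the path up front is the only way to fail fast.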
>>>>>>> On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick <pzybrick@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Try using an absolute path to the pem file
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> > On Jan 26, 2015, at 8:57 PM, ey-chih chow <eychih@hotmail.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > I used the spark-ec2 script of spark 1.2 to launch a cluster. I have
>>>>>>> > modified the script according to
>>>>>>> >
>>>>>>> > https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab9753aae939b3bb33be953e2c13a70
>>>>>>> >
>>>>>>> > But the script was still hung at the following message:
>>>>>>> >
>>>>>>> > Waiting for cluster to enter 'ssh-ready'
>>>>>>> > state.............................................
>>>>>>> >
>>>>>>> > Any additional thing I should do to make it succeed?  Thanks.
>>>>>>> >
>>>>>>> >
>>>>>>> > Ey-Chih Chow
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-ec2-launch-script-hang-tp21381.html
>>>>>>> > Sent from the Apache Spark User List mailing list archive at
>>>>>>> Nabble.com.
>>>>>>> >
>>>>>>> > ---------------------------------------------------------------------
>>>>>>> > To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>>>>>>> > For additional commands, e-mail: user-help@spark.apache.org
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>
>>
