On Mon, Jan 20, 2014 at 11:05 PM, Ognen Duzlevski <email@example.com> wrote:
Thanks. I will try that, but your assumption is that something is failing in an obvious way with a message. By the look of the spark-shell (just frozen), I would say something is "stuck". Will report back.
Given the suspicious nature of the "freezing" of the shell, it looked to me like a timeout or some kind of a "wait".
I whipped out tcpdump on a node in the cluster and noticed that the nodes try to connect back to master on some (random?) port. I realized that my VPC security group was too restrictive. As soon as I allowed all tcp and udp traffic within the VPC, it magically worked ;)
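For anyone who wants to check for the same thing without tcpdump, here is a minimal sketch of a TCP probe you could run from a worker node against the master. It is only an illustration, not anything Spark ships: the function name `probe_tcp` and the host and port in the usage comment are made up, so substitute whatever port you actually see in your own capture. The idea is that a filtered port usually shows up as a timeout (packets silently dropped), while a port with nothing listening shows up as a refused connection.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a TCP connect and classify the outcome.

    Returns "open", "refused", "timeout", or "error".
    A "timeout" is what you would typically expect when a firewall or
    security group drops the packets instead of rejecting them.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # DNS failure, no route to host, and similar problems
        return "error"


# Example usage (hypothetical host and port, replace with your own):
#   print(probe_tcp("master.internal.example", 7077))
```

If the probe reports "timeout" from the workers but "open" when run on the master itself, blocked traffic between the nodes is a likely suspect.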
So, problem solved. It is not a bug after all, just traffic being blocked.
In any case, I am documenting this as I go. As soon as I have a viable "data pipeline" in the VPC I will publish something for everyone to read. I figure another experience report wouldn't hurt.