spark-user mailing list archives

From Hokam Singh Chauhan <hokam.1...@gmail.com>
Subject Re: Problem of submitting Spark task to cluster from eclipse IDE on Windows
Date Thu, 24 Dec 2015 02:24:58 GMT
Hi,

If you are using the IP address in the Spark master URL, use the hostname
instead: spark://hostname:7077.

I faced the same issue, and it was resolved by switching the master URL
from the IP address to the hostname.
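
As a minimal sketch of that change, assuming the master advertises itself as hadoop00 (the name shown in the "inbound addresses" line of the master log quoted below) and that the Windows machine can resolve hadoop00, for example through a hosts-file entry, the driver from the original post would look something like this. The object name WordCountByHostname is only illustrative; the jar path and HDFS URL are copied unchanged from the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountByHostname {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("Scala WordCount")
      // Assumption: "hadoop00" is the name the master bound to, and this
      // machine can resolve it. The IP form of the URL was what got dropped.
      .setMaster("spark://hadoop00:7077")
      .setJars(List("C:\\Temp\\test.jar"))
    val sc = new SparkContext(conf)
    try {
      val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
      textFile.flatMap(_.split(" "))
        .map((_, 1))
        .reduceByKey(_ + _)
        .collect()
        .foreach(println)
    } finally {
      sc.stop() // release the application's resources on the cluster
    }
  }
}
```

The only functional change from the original code is the master URL; the try/finally with sc.stop() is an optional tidy-up.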

Regards,
Hokam
On 23 Dec 2015 13:41, "Akhil Das" <akhil@sigmoidanalytics.com> wrote:

> You need to:
>
> 1. Make sure your local router has NAT enabled and forwards the
> networking ports listed here
> <http://spark.apache.org/docs/latest/configuration.html#networking>.
> 2. Make sure port 7077 on your cluster is accessible from your local (public)
> IP address. You can test this with: telnet 10.20.17.70 7077
> 3. Set spark.driver.host so that the cluster can connect back to your
> machine.
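
If telnet is not available on the Windows machine, the reachability check in step 2 can be approximated from Scala itself. The following is a rough sketch using only the JDK socket classes; PortCheck and canConnect are hypothetical names, and the 3000 ms default timeout is an arbitrary choice.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

object PortCheck {
  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // Covers refused connections, timeouts and unresolvable host names.
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Master address and port taken from the thread.
    println("master reachable: " + canConnect("10.20.17.70", 7077))
  }
}
```

A successful connection only shows that the master port is open from the client side. For step 3, the cluster also has to reach the driver, which would typically mean calling conf.set("spark.driver.host", "10.20.6.23") on the SparkConf, where 10.20.6.23 is the poster's own machine.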
>
>
>
> Thanks
> Best Regards
>
> On Wed, Dec 23, 2015 at 10:02 AM, superbee84 <holybee@qq.com> wrote:
>
>> Hi All,
>>
>>    I'm new to Spark. Before I describe the problem, I'd like to explain
>> the roles of the machines in the cluster and the purpose of my work. By
>> reading and following the instructions and tutorials, I successfully
>> built a cluster of 7 CentOS 6.5 machines. I installed Hadoop 2.7.1,
>> Spark 1.5.1, Scala 2.10.4 and ZooKeeper 3.4.5 on them. The details are
>> listed below:
>>
>>
>> Host Name | IP Address  | Hadoop 2.7.1             | Spark 1.5.1     | ZooKeeper
>> hadoop00  | 10.20.17.70 | NameNode(Active)         | Master(Active)  | none
>> hadoop01  | 10.20.17.71 | NameNode(Standby)        | Master(Standby) | none
>> hadoop02  | 10.20.17.72 | ResourceManager(Active)  | none            | none
>> hadoop03  | 10.20.17.73 | ResourceManager(Standby) | none            | none
>> hadoop04  | 10.20.17.74 | DataNode                 | Worker          | JournalNode
>> hadoop05  | 10.20.17.75 | DataNode                 | Worker          | JournalNode
>> hadoop06  | 10.20.17.76 | DataNode                 | Worker          | JournalNode
>>
>>    Now my *purpose* is to develop Hadoop/Spark applications on my own
>> computer (IP: 10.20.6.23) and submit them to the remote cluster. As
>> everyone else in our group is used to working with Eclipse on Windows, I'm
>> trying to make this setup work. I have successfully submitted the WordCount
>> MapReduce job to YARN, and it ran smoothly from Eclipse on Windows. But when
>> I tried to run the Spark WordCount, I got the following error in the Eclipse
>> console:
>>
>> 15/12/23 11:15:30 INFO AppClient$ClientEndpoint: Connecting to master spark://10.20.17.70:7077...
>> 15/12/23 11:15:50 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
>> java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@29ed85e7 rejected from java.util.concurrent.ThreadPoolExecutor@28f21632[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
>>         at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(Unknown Source)
>>         at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
>>         at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
>>         at java.util.concurrent.AbstractExecutorService.submit(Unknown Source)
>>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:96)
>>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1.apply(AppClient.scala:95)
>>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>>         at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>>         at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
>>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>>         at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
>>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint.tryRegisterAllMasters(AppClient.scala:95)
>>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint.org$apache$spark$deploy$client$AppClient$ClientEndpoint$$registerWithMaster(AppClient.scala:121)
>>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:132)
>>         at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
>>         at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:124)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>>         at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>         at java.lang.Thread.run(Unknown Source)
>> 15/12/23 11:15:50 INFO DiskBlockManager: Shutdown hook called
>> 15/12/23 11:15:50 INFO ShutdownHookManager: Shutdown hook called
>>
>>     Then I checked the Spark Master log and found the following critical
>> statements:
>>
>> 15/12/23 11:15:33 ERROR ErrorMonitor: dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://sparkMaster@10.20.17.70:7077/]] arriving at [akka.tcp://sparkMaster@10.20.17.70:7077] inbound addresses are [akka.tcp://sparkMaster@hadoop00:7077]
>> akka.event.Logging$Error$NoCause$
>> 15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing it.
>> 15/12/23 11:15:53 INFO Master: 10.20.6.23:56374 got disassociated, removing it.
>> 15/12/23 11:15:53 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@10.20.6.23:56374] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
>>
>>     Here's my Scala code:
>>
>>     import org.apache.spark.{SparkConf, SparkContext}
>>
>>     object WordCount {
>>       def main(args: Array[String]): Unit = {
>>         // Master URL uses the IP address, as in the original post
>>         val conf = new SparkConf()
>>           .setAppName("Scala WordCount")
>>           .setMaster("spark://10.20.17.70:7077")
>>           .setJars(List("C:\\Temp\\test.jar"))
>>         val sc = new SparkContext(conf)
>>         val textFile = sc.textFile("hdfs://10.20.17.70:9000/wc/indata/wht.txt")
>>         // Split lines into words, count each word, and print the counts
>>         textFile.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect().foreach(println)
>>       }
>>     }
>>
>>     To solve the problem, I tried the following:
>>
>>     (1) Ran spark-shell to check the Scala version; it is 2.10.4, which is
>> compatible with the Eclipse Scala plugin.
>>     (2) Ran spark-submit on the SparkPi example with the --master param set
>> to "10.20.17.70:7077"; it computed the result successfully. I was also able
>> to see the application history on the Master's Web UI.
>>     (3) Turned off the firewall on my Windows machine.
>>
>>     Unfortunately, the error remains. Could anybody give me some
>> suggestions? Thanks very much!
>>
>> Yours Sincerely,
>> Yefeng
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Problem-of-submitting-Spark-task-to-cluster-from-eclipse-IDE-on-Windows-tp25778.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>>
>
