spark-user mailing list archives

From Oleg Ruchovets <oruchov...@gmail.com>
Subject pyspark yarn got exception
Date Tue, 02 Sep 2014 09:42:01 GMT
Hi,
   I've installed PySpark on an HDP (Hortonworks) cluster.
  Executing the Pi example:

command (run from the spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563 directory):

       ./bin/spark-submit --master spark://10.193.1.71:7077 \
           examples/src/main/python/pi.py 1000
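Note that although the subject mentions YARN, the command above targets a standalone master URL. For comparison, a YARN submission in Spark 1.0.x would look roughly like the sketch below (a hedged example, not taken from the original message; the /etc/hadoop/conf path is an assumption based on common HDP defaults):

```shell
# Assumption: HADOOP_CONF_DIR must point at the cluster's Hadoop config
# directory so spark-submit can locate the ResourceManager.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Spark 1.0.x YARN client-mode syntax (no spark:// URL needed):
./bin/spark-submit --master yarn-client examples/src/main/python/pi.py 1000
```

In this mode the master host/port comes from the YARN configuration rather than from a spark:// URL, which is why the two submission styles are easy to mix up.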

exception:

    14/09/02 17:34:02 INFO SecurityManager: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/09/02 17:34:02 INFO SecurityManager: Changing view acls to: root
14/09/02 17:34:02 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/09/02 17:34:02 INFO Slf4jLogger: Slf4jLogger started
14/09/02 17:34:02 INFO Remoting: Starting remoting
14/09/02 17:34:03 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@HDOP-M.AGT:41059]
14/09/02 17:34:03 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@HDOP-M.AGT:41059]
14/09/02 17:34:03 INFO SparkEnv: Registering MapOutputTracker
14/09/02 17:34:03 INFO SparkEnv: Registering BlockManagerMaster
14/09/02 17:34:03 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140902173403-cda8
14/09/02 17:34:03 INFO MemoryStore: MemoryStore started with capacity 294.9 MB.
14/09/02 17:34:03 INFO ConnectionManager: Bound socket to port 34931 with id = ConnectionManagerId(HDOP-M.AGT,34931)
14/09/02 17:34:03 INFO BlockManagerMaster: Trying to register BlockManager
14/09/02 17:34:03 INFO BlockManagerInfo: Registering block manager HDOP-M.AGT:34931 with 294.9 MB RAM
14/09/02 17:34:03 INFO BlockManagerMaster: Registered BlockManager
14/09/02 17:34:03 INFO HttpServer: Starting HTTP Server
14/09/02 17:34:03 INFO HttpBroadcast: Broadcast server started at http://10.193.1.71:54341
14/09/02 17:34:03 INFO HttpFileServer: HTTP File server directory is /tmp/spark-77c7a7dc-181e-4069-a014-8103a6a6330a
14/09/02 17:34:03 INFO HttpServer: Starting HTTP Server
14/09/02 17:34:04 INFO SparkUI: Started SparkUI at http://HDOP-M.AGT:4040
14/09/02 17:34:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/09/02 17:34:04 INFO Utils: Copying /root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/examples/src/main/python/pi.py to /tmp/spark-f2e0cc0f-59cb-4f6c-9d48-f16205a40c7e/pi.py
14/09/02 17:34:04 INFO SparkContext: Added file file:/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/examples/src/main/python/pi.py at http://10.193.1.71:52938/files/pi.py with timestamp 1409650444941
14/09/02 17:34:05 INFO AppClient$ClientActor: Connecting to master spark://10.193.1.71:7077...
14/09/02 17:34:05 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
14/09/02 17:34:05 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
14/09/02 17:34:05 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
14/09/02 17:34:05 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
14/09/02 17:34:25 INFO AppClient$ClientActor: Connecting to master spark://10.193.1.71:7077...
14/09/02 17:34:25 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
14/09/02 17:34:25 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
14/09/02 17:34:25 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
14/09/02 17:34:25 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@10.193.1.71:7077: akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkMaster@10.193.1.71:7077]
Traceback (most recent call last):
  File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/examples/src/main/python/pi.py", line 38, in <module>
    count = sc.parallelize(xrange(1, n+1), slices).map(f).reduce(add)
  File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/python/pyspark/context.py", line 271, in parallelize
    jrdd = readRDDFromFile(self._jsc, tempFile.name, numSlices)
  File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", line 537, in __call__
  File "/root/spark-1.0.1.2.1.3.0-563-bin-2.4.0.2.1.3.0-563/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.readRDDFromFile.
: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.spark.api.python.PythonRDD$.readRDDFromFile(PythonRDD.scala:279)
    at org.apache.spark.api.python.PythonRDD.readRDDFromFile(PythonRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:744)



Question:
    How can I find out the Spark master host and port? Where is it defined?

Thanks
Oleg.
