spark-user mailing list archives

From Connor Zanin <cnnr...@udel.edu>
Subject Fwd: How does the # of tasks affect # of threads?
Date Sat, 01 Aug 2015 20:47:29 GMT
Hello,

I am having an issue when I run a word count job; I have included the
source and log files for reference. The job finishes successfully, but
about halfway through I get a java.lang.OutOfMemoryError ("unable to
create new native thread"), which leads to the loss of an executor. After
some searching, I found that this is an environment problem: the OS limits
how many threads a process can spawn.
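
In case it is useful to others, here is a quick sketch of how to check
that limit and the live thread count from inside a driver or executor JVM.
This assumes Linux, where the "Max processes" row of /proc/self/limits is
the nproc ulimit that also caps thread creation:

    import java.lang.management.ManagementFactory
    import scala.io.Source

    object ThreadLimits {
      def main(args: Array[String]): Unit = {
        // On Linux, "Max processes" (the nproc ulimit) also caps threads;
        // hitting it is what produces this OutOfMemoryError.
        val nproc = Source.fromFile("/proc/self/limits")
          .getLines()
          .find(_.startsWith("Max processes"))
        println(s"OS limit: ${nproc.getOrElse("unknown (not Linux?)")}")

        // Threads currently alive in this JVM.
        println(s"Live JVM threads: ${ManagementFactory.getThreadMXBean.getThreadCount}")
      }
    }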

However, I had thought that Spark maintained a thread pool equal in size
to the number of cores available across the nodes (by default) and
scheduled tasks dynamically as threads became available. The only Spark
parameter I change is the number of partitions in my RDD.
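
For illustration, here is a stripped-down version of the kind of job I
mean (the paths and the partition count below are placeholders, not my
real values); the only thing I vary is the minPartitions argument:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("WordCount"))

        // More partitions means more tasks; my understanding was that each
        // executor still runs at most one task per available core at a time.
        val counts = sc.textFile("hdfs:///input.txt", minPartitions = 4096)
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1L))
          .reduceByKey(_ + _)

        counts.saveAsTextFile("hdfs:///wordcount-out")
        sc.stop()
      }
    }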

My question is: how does Spark decide how many threads to spawn, and when?

-- 
Regards,

Connor Zanin
Computer Science
University of Delaware


