My driver program runs a spark streaming job.  And it spawns a thread by itself only in the onStart() function below Other than that it doesn't spawn any other threads. It only calls MapToPair, ReduceByKey, forEachRDD, Collect functions.
public class NSQReceiver extends Receiver<String> {

private String topic="";

public NSQReceiver(String topic) {
super(StorageLevel.MEMORY_AND_DISK_2());
this.topic = topic;
}

@Override
public void onStart() {
new Thread() {
@Override public void run() {
receive();
}
}.start();
}
}

Environment info:
Java 8
Scala 2.11.8
Spark 2.0.0
More than happy to share any other info you may need.

On Mon, Oct 31, 2016 at 11:05 AM, Jakob Odersky <jakob@odersky.com> wrote:
 > how do I tell my spark driver program to not create so many?

This may depend on your driver program. Do you spawn any threads in
it? Could you share some more information on the driver program, spark
version and your environment? It would greatly help others to help you

On Mon, Oct 31, 2016 at 3:47 AM, kant kodali <kanth909@gmail.com> wrote:
> The source of my problem is actually that I am running into the following
> error. This error seems to happen after running my driver program for 4
> hours.
>
> "Exception in thread "ForkJoinPool-50-worker-11" Exception in thread
> "dag-scheduler-event-loop" Exception in thread "ForkJoinPool-50-worker-13"
> java.lang.OutOfMemoryError: unable to create new native thread"
>
> and this wonderful book taught me that the error "unable to create new
> native thread" can happen because JVM is trying to request the OS for a
> thread and it is refusing to do so for the following reasons
>
> 1. The system has actually run out of virtual memory.
> 2. On Unix-style systems, the user has already created (between all programs
> user is running) the maximum number of processes configured for that user
> login. Individual threads are considered a process in that regard.
>
> Option #2 is ruled out in my case because my driver programing is running
> with a userid of root which has  maximum number of processes set to 120242
>
> ulimit -a gives me the following
>
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 120242
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 120242
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> So at this point I do understand that the I am running out of memory due to
> allocation of threads so my biggest question is how do I tell my spark
> driver program to not create so many?
>
> On Mon, Oct 31, 2016 at 3:25 AM, Sean Owen <sowen@cloudera.com> wrote:
>>
>> ps -L [pid] is what shows threads. I am not sure this is counting what you
>> think it does. My shell process has about a hundred threads, and I can't
>> imagine why one would have thousands unless your app spawned them.
>>
>> On Mon, Oct 31, 2016 at 10:20 AM kant kodali <kanth909@gmail.com> wrote:
>>>
>>> when I do
>>>
>>> ps -elfT | grep "spark-driver-program.jar" | wc -l
>>>
>>> The result is around 32K. why does it create so many threads how can I
>>> limit this?
>
>