I used "local[*]". The CPU hits about 80% when there are active jobs, then it drops to about 13% and hangs for a very long time.


On Mon, 16 Mar 2015 17:46 Akhil Das <akhil@sigmoidanalytics.com> wrote:
How many threads are you allocating while creating the SparkContext? e.g. local[4] will allocate 4 threads. You can try increasing it to a higher number, and also try setting the level of parallelism to a higher number.
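A minimal sketch of both suggestions (the thread count 8 and parallelism 16 are only illustrative values, and this assumes a local Spark installation):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Allocate 8 worker threads instead of the default, and raise the
// default parallelism so shuffles produce more, smaller partitions.
// Tune both numbers to your machine.
val conf = new SparkConf()
  .setAppName("kmeans-local")
  .setMaster("local[8]")                     // local[*] uses all available cores
  .set("spark.default.parallelism", "16")
val sc = new SparkContext(conf)
```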

Best Regards

On Mon, Mar 16, 2015 at 9:55 AM, Xi Shen <davidshen84@gmail.com> wrote:

I am running k-means using Spark in local mode. My data set is about 30k records, and I set k = 1000.

The algorithm started and finished 13 jobs according to the UI monitor, then it stopped working.

The last log I saw was:

[Spark Context Cleaner] INFO org.apache.spark.ContextCleaner - Cleaned broadcast 16

There are many similar log lines repeated, but it seems it always stops at the 16th.

If I lower the k value, the algorithm terminates. So I just want to know what's wrong with k = 1000.
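For reference, a minimal sketch of the kind of run described above, using MLlib's KMeans.train (the input file "data.txt", its space-separated format, and the iteration count are assumptions for illustration; this needs a Spark installation to run):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

val sc = new SparkContext(
  new SparkConf().setAppName("kmeans-local").setMaster("local[*]"))

// "data.txt" is a hypothetical file with one space-separated
// feature vector per line.
val data = sc.textFile("data.txt")
  .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
  .cache()   // k-means is iterative, so caching the input matters

// ~30k records with k = 1000, as in the run above;
// maxIterations = 20 is MLlib's usual default-style choice, not from the source.
val model = KMeans.train(data, 1000, 20)
```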