spark-user mailing list archives

From <>
Subject CPU Parallelization not being used (local mode)
Date Mon, 27 Jul 2015 19:57:27 GMT
Hi all,

I would like some insight. I am currently running computations over huge databases, and experimenting
with monitoring and tuning.

When I monitor my cores, I see that even when RDDs are parallelized, the computation on the RDD
jumps from core to core sporadically (I guess depending on where the chunk is). One core sits at
100% usage while the others sit idle; after some time, when the task completes, the processing
jumps to another core, and so on.

Can you share any general insight on this situation? Does this depend on the computation?
I have tried serialization and different setups, but I never see more than one core working
within a single Spark submission.

Note: this is not cluster mode, just local processors.
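
One thing that may be worth checking in a local-mode setup like this is the master URL. Per the
Spark documentation, a plain "local" master runs with a single worker thread, "local[K]" runs
with K threads, and "local[*]" uses as many threads as there are logical cores. The sketch below
is only an illustration in plain Python: local_worker_threads is a hypothetical helper, not part
of Spark, and the master URLs passed to it are assumed examples, not the values used in this job.

```python
import os
import re


def local_worker_threads(master):
    """Return the worker-thread count implied by a Spark local master URL.

    Hypothetical helper for illustration only (not a Spark API).
    Returns None when the URL is not a local master.
    """
    if master == "local":
        # Plain "local" means one worker thread, i.e. no parallelism.
        return 1
    # Matches local[K], local[*], and the local[K,F] / local[*,F] forms,
    # where F is the allowed number of task failures.
    match = re.fullmatch(r"local\[(\*|\d+)(?:\s*,\s*\d+)?\]", master)
    if match is None:
        return None
    if match.group(1) == "*":
        # "*" stands for the number of logical cores on this machine.
        return os.cpu_count()
    return int(match.group(1))


# Assumed example URLs; the actual master of the job is not known here.
for url in ("local", "local[4]", "local[*]"):
    print(url, "->", local_worker_threads(url))
```

In a running PySpark session, sc.master, sc.defaultParallelism and rdd.getNumPartitions() report
the master in use, the default number of partitions, and the partition count of a given RDD. An
RDD with a single partition is processed by one task at a time regardless of the thread count.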

