spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: What is the interpretation of Cores in Spark doc
Date Mon, 13 Jun 2016 07:02:33 GMT
Hi,

This is not a matter of testing anything. I was referring to the documentation,
which clearly uses the term "threads". As I said and showed before, one line
uses the term "thread" and the next one "logical cores".


HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 12 June 2016 at 23:57, Daniel Darabos <daniel.darabos@lynxanalytics.com>
wrote:

> Spark is a software product. In software a "core" is something that a
> process can run on. So it's a "virtual core". (Do not call these "threads".
> A "thread" is not something a process can run on.)
>
> local[*] uses java.lang.Runtime.availableProcessors()
> <https://github.com/apache/spark/blob/v1.6.1/core/src/main/scala/org/apache/spark/SparkContext.scala#L2608>.
> Since Java is software, this also returns the number of virtual cores. (You
> can test this easily.)
>
>
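
To make the point about local[*] concrete, here is a minimal sketch of how a
local master string could map to a worker-thread count. It is in Python rather
than Scala, it uses os.cpu_count() as a stand-in for
java.lang.Runtime.availableProcessors(), and the name local_thread_count is
hypothetical. It is an illustration under those assumptions, not Spark's
actual parsing code.

```python
import os
import re


def local_thread_count(master, logical_cpus=None):
    """Return the worker-thread count implied by a local master string.

    Hypothetical helper for illustration only, not Spark's implementation.
    """
    if logical_cpus is None:
        # os.cpu_count() reports logical CPUs (hardware threads), much like
        # Runtime.availableProcessors() on the JVM. It can return None.
        logical_cpus = os.cpu_count() or 1
    if master == "local":
        # Plain "local" means a single worker thread.
        return 1
    match = re.fullmatch(r"local\[(\*|\d+)\]", master)
    if match is None:
        raise ValueError(f"not a local master string: {master!r}")
    value = match.group(1)
    # "*" means one worker thread per logical CPU; a number is taken as-is.
    return logical_cpus if value == "*" else int(value)


if __name__ == "__main__":
    # Assumed host: 6 physical cores, 12 logical processors.
    print(local_thread_count("local[*]", logical_cpus=12))
    print(local_thread_count("local[6]", logical_cpus=12))
```

With 12 logical processors passed in, local[*] resolves to 12 worker threads
and local[6] to 6, which matches the behaviour described in this thread.
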
> On Sun, Jun 12, 2016 at 9:23 PM, Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
>>
>> Hi,
>>
>> I was writing some docs on Spark P&T and came across this.
>>
>> It is about the terminology in the Spark docs, or how to interpret it.
>>
>> This is my understanding of cores and threads.
>>
>> Cores are physical cores. Threads are virtual cores. A core with 2 threads
>> uses hyper-threading technology, so 2 threads per core lets the core work
>> on two loads at the same time. In other words, every thread takes care of
>> one load.
>>
>> Each core has its own memory. So if you have a dual core with hyper-threading,
>> each core works on 2 loads at the same time because of the 2 threads per
>> core, but these 2 threads share the memory in that core.
>>
>> Some vendors, as I am sure most of you are aware, charge licensing per core.
>>
>> For example, on the same host where I have Spark, I have a SAP product that
>> checks the licensing and shuts the application down if the license does not
>> agree with the cores specified.
>>
>> This is what it says
>>
>> ./cpuinfo
>> License hostid:        00e04c69159a 0050b60fd1e7
>> Detected 12 logical processor(s), 6 core(s), in 1 chip(s)
>>
>> So here I have 12 logical processors, 6 cores and 1 chip. I call
>> logical processors threads, so I have 12 threads?
>>
>> Now if I start the worker processes with ${SPARK_HOME}/sbin/start-slaves.sh,
>> I see this on the GUI page:
>>
>> [image: Inline images 1]
>>
>> It says 12 cores, but I gather these are threads?
>>
>>
>> Spark document
>> <http://spark.apache.org/docs/latest/submitting-applications.html>
>> states and I quote
>>
>>
>> [image: Inline images 2]
>>
>>
>>
>> OK, the line for local[k] adds ... *set this to the number of cores on your
>> machine*
>>
>>
>> But I know that it means threads, because if I set that to 6, I would
>> get only 6 threads as opposed to 12 threads.
>>
>>
>> The next line, local[*], seems to state it correctly, as it refers to
>> "logical cores", which in my understanding are threads.
>>
>>
>> I trust that I am not nitpicking here!
>>
>>
>> Cheers,
>>
>>
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>
>
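
For reference, the arithmetic behind the cpuinfo line quoted above can be
sketched as follows. The function name parse_cpuinfo and the regular
expression are my own assumptions, based only on the single output line shown
in this thread. The tool's real output format may vary.

```python
import re

# The line exactly as quoted in the thread.
CPUINFO_LINE = "Detected 12 logical processor(s), 6 core(s), in 1 chip(s)"


def parse_cpuinfo(line):
    """Extract the counts from one cpuinfo summary line (hypothetical helper)."""
    pattern = (
        r"Detected (\d+) logical processor\(s\), "
        r"(\d+) core\(s\), in (\d+) chip\(s\)"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("unrecognised cpuinfo line")
    logical, cores, chips = (int(group) for group in match.groups())
    return {
        "logical": logical,
        "cores": cores,
        "chips": chips,
        # Logical processors per physical core; 2 suggests hyper-threading.
        "threads_per_core": logical // cores,
    }


if __name__ == "__main__":
    print(parse_cpuinfo(CPUINFO_LINE))
```

On that host the ratio is 12 / 6 = 2 logical processors per physical core, so
the "12 cores" in the worker GUI corresponds to the logical processor count
rather than the 6 physical cores used for licensing.
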
