spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From lonikar <loni...@gmail.com>
Subject Re: Code generation for GPU
Date Mon, 07 Sep 2015 19:44:14 GMT
Hi Reynold,

Thanks for responding. I was waiting for this on the spark user group and my
own email id since I had not posted this on spark dev. Just saw your reply.

1. I figured the various code generation classes have either *apply* or
*eval* method depending on whether it computes something or uses expression
as filter. And the code that executes this generated code is in
sql.execution.basicOperators.scala.

2. If the vectorization is difficult or a major effort, I am not sure how I
am going to implement even a glimpse of changes I would like to. I think I
will have to satisfied with only a partial effort. Batching rows defeats the
purpose as I have found that it consumes a considerable amount of CPU cycles
and producing one row at a time also takes away the performance benefit.
Whats really required is to access a large partition and produce the result
partition in one shot. 

I think I will have to severely limit the scope of my talk in that case. Or
re-orient it to propose the changes instead of presenting the results of
execution on GPU. Please suggest since you seem to have selected the talk.

3. I agree, its pretty high paced development. I have started working on
1.5.1 spapshot.

4. How do I tune the batch size (number of rows in the ByteBuffer)? Is it
through the property spark.sql.inMemoryColumnarStorage.batchSize?

-Kiran



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Re-Code-generation-for-GPU-tp13954p13989.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Mime
View raw message