spark-dev mailing list archives

From lonikar <>
Subject Re: Code generation for GPU
Date Mon, 07 Sep 2015 19:44:14 GMT
Hi Reynold,

Thanks for responding. I was waiting for this on the spark user group and my
own email id since I had not posted this on spark dev. Just saw your reply.

1. I figured out that the various code generation classes have either an
*apply* or an *eval* method, depending on whether they compute a value or
evaluate an expression as a filter. And the code that executes this generated
code is in
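To make the apply/eval distinction concrete, here is a minimal sketch in plain Scala. The trait and class names (Expr, Proj, gtTen, plusOne) are hypothetical stand-ins for Catalyst's generated classes, not the actual Spark API:

```scala
// Filter-style generated code: eval() returns a value (here a Boolean)
// for one input row.
trait Expr { def eval(row: Seq[Any]): Any }

// Hypothetical generated predicate for "col0 > 10".
val gtTen = new Expr {
  def eval(row: Seq[Any]): Any = row(0).asInstanceOf[Int] > 10
}

// Projection-style generated code: apply() computes an output row
// from an input row.
trait Proj { def apply(row: Seq[Any]): Seq[Any] }

// Hypothetical generated projection for "col0 + 1".
val plusOne = new Proj {
  def apply(row: Seq[Any]): Seq[Any] = Seq(row(0).asInstanceOf[Int] + 1)
}
```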

2. If vectorization is difficult or a major effort, I am not sure how I am
going to implement even a glimpse of the changes I would like to. I think I
will have to be satisfied with only a partial effort. Batching rows defeats
the purpose, as I have found that it consumes a considerable amount of CPU
cycles, and producing one row at a time also takes away the performance
benefit. What's really needed is to access a large partition and produce the
result partition in one shot.
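The access-pattern difference above can be sketched in plain Scala. The function names are hypothetical, chosen only to contrast one-row-at-a-time evaluation with processing a whole contiguous column, which is the shape a GPU kernel would want:

```scala
// Row-at-a-time: one call (and typically one virtual dispatch) per row.
def sumRowAtATime(rows: Iterator[Int]): Long =
  rows.foldLeft(0L)((acc, v) => acc + v)

// Whole-partition/columnar: the column is one contiguous array, and a
// single tight loop processes the entire partition. This loop is what
// could be handed off to a GPU kernel in one shot.
def sumBatch(col: Array[Int]): Long = {
  var acc = 0L
  var i = 0
  while (i < col.length) { acc += col(i); i += 1 }
  acc
}
```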

I think I will have to severely limit the scope of my talk in that case, or
re-orient it to propose the changes instead of presenting the results of
execution on a GPU. Please advise, since you seem to have selected the talk.

3. I agree, it's pretty fast-paced development. I have started working on a
1.5.1 snapshot.

4. How do I tune the batch size (the number of rows in the ByteBuffer)? Is it
through the property spark.sql.inMemoryColumnarStorage.batchSize?
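For reference, that property does exist in Spark SQL and controls the number of rows per in-memory columnar batch (it defaults to 10000 in the 1.x line). A minimal sketch of setting it, assuming an already-constructed sqlContext:

```scala
// Config fragment (assumes an existing SQLContext named sqlContext):
// raise the in-memory columnar batch size from the default of 10000.
sqlContext.setConf("spark.sql.inMemoryColumnarStorage.batchSize", "100000")
```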

