spark-dev mailing list archives

From Steve Loughran <>
Subject Re: Code generation for GPU
Date Tue, 08 Sep 2015 11:27:15 GMT

On 7 Sep 2015, at 20:44, lonikar <<>>

2. If the vectorization is difficult or a major effort, I am not sure how I
am going to implement even a glimpse of the changes I would like to. I think I
will have to be satisfied with only a partial effort. Batching rows defeats the
purpose, as I have found that it consumes a considerable amount of CPU cycles,
and producing one row at a time also takes away the performance benefit.
What's really required is to access a large partition and produce the result
partition in one shot.

Why not look at the DataFrame APIs and the back-end implementations of things which support
them? The data sources which are columnized from the outset (ORC, Parquet) are the ones where
vector operations work well: you can read a set of columns, perform a parallel operation, then

If you can hook up to a column structure you may get that speedup.
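
To make the row-versus-column point concrete, here is a minimal sketch in plain Python. It is not Spark, ORC or Parquet code; the sample rows and the helper names `to_columns` and `scale_column` are made up for illustration. It only shows why a column held in one contiguous buffer is the natural unit to hand to a vectorized or GPU kernel.

```python
from array import array

# Row-oriented: each record is a tuple, so reading one field touches every record.
rows = [(1, 2.0, "a"), (2, 4.0, "b"), (3, 6.0, "c")]

def to_columns(rows):
    """Transpose row tuples into one contiguous typed array per numeric column."""
    ids = array("q", (r[0] for r in rows))     # 64-bit signed integers
    values = array("d", (r[1] for r in rows))  # 64-bit floats
    return ids, values

def scale_column(values, factor):
    """Whole-column operation: a single pass over a single contiguous buffer."""
    return array("d", (v * factor for v in values))

ids, values = to_columns(rows)
scaled = scale_column(values, 0.5)
```

A columnar source already stores its data in roughly the shape `to_columns` produces, so the transposition cost (the batching overhead described in the quoted text) would not have to be paid per query.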

I think I will have to severely limit the scope of my talk in that case. Or
re-orient it to propose the changes instead of presenting the results of
execution on GPU. Please suggest since you seem to have selected the talk.

It is always essential to have the core of your talk ready before you propose the talk; it's
something reviewers (nothing to do with me here) mostly expect. Otherwise you are left in
a panic three days before, trying to bash together some slides you will have to present
to an audience that may include people who know the code better than you. I've been there,
and fear I will be there again in 3 weeks' time.

Some general suggestions

  1.  Assume the audience knows Spark, but not how to code for GPUs: introduce that on a slide
or two.
  2.  Cover the bandwidth problem: how much computation is needed before working with the
GPU is justified.
  3.  Look at the body of work on Hadoop MapReduce & GPUs and the limitations (IO bandwidth,
intermediate stage B/W) as well as benefits (perf on CPU workloads, power budget).
  4.  Cover how that's changing: SSDs, in-memory filesystems, whether InfiniBand would help.
  5.  Try to demo something. It's always nice to show something working at a talk, even if
it's just on your laptop.
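
For point 2, a back-of-the-envelope model may be enough for a slide. The sketch below assumes a deliberately simple cost model: the CPU takes `flops / cpu_rate`, while the GPU takes the transfer time `nbytes / link_bandwidth` plus `flops / gpu_rate`, with no overlap of transfer and compute. The function names and any numbers you feed in are illustrative assumptions, not measurements.

```python
def cpu_time(flops, cpu_flops_per_s):
    """Seconds to do the work on the CPU (data assumed already in host memory)."""
    return flops / cpu_flops_per_s

def offload_time(flops, nbytes, gpu_flops_per_s, link_bytes_per_s):
    """Seconds to ship nbytes over the link and then compute on the GPU.

    Simplifying assumption: transfer and compute do not overlap.
    """
    return nbytes / link_bytes_per_s + flops / gpu_flops_per_s

def break_even_intensity(cpu_flops_per_s, gpu_flops_per_s, link_bytes_per_s):
    """Operations per transferred byte at which CPU and GPU times are equal.

    Derived from  I / C = 1 / B + I / G,  so  I = (1 / B) / (1 / C - 1 / G).
    Below this intensity the transfer cost dominates and the CPU wins.
    """
    if gpu_flops_per_s <= cpu_flops_per_s:
        return float("inf")  # a GPU that is not faster never pays off in this model
    return (1.0 / link_bytes_per_s) / (1.0 / cpu_flops_per_s - 1.0 / gpu_flops_per_s)
```

Plugging in your own measured CPU rate, GPU rate and link bandwidth gives a threshold the audience can compare against typical Spark operators.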
