systemml-dev mailing list archives

From "Niketan Pansare" <npan...@us.ibm.com>
Subject Re: Discussion on GPU backend
Date Wed, 18 May 2016 17:55:01 GMT

Hi Deron,

Good points. I vote that we keep JCuda, and any other accelerators we add, as
an external dependency. This means the user will have to ensure that JCuda.jar
is on the class path and that JCuda.DLL/JCuda.so is on the library path
(LD_LIBRARY_PATH on Linux, PATH on Windows).

I don't think JCuda.jar is platform-specific.
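
A minimal sketch of how a launcher could verify such an external dependency
before enabling the accelerator. Everything here is an assumption for
illustration: the class AcceleratorDependencyCheck and its methods are
hypothetical, the default class name jcuda.runtime.JCuda is my guess at a
suitable probe class, and the native library file name has to be supplied by
the caller because it varies by platform and JCuda version. On a typical Linux
JVM, java.library.path is seeded from LD_LIBRARY_PATH.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of SystemML or JCuda.
final class AcceleratorDependencyCheck {

    // True if the named class is visible on the class path.
    // The class is looked up without running its static initializers.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false,
                    AcceleratorDependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the directories of a path-style string that contain fileName.
    static List<String> findOnPath(String searchPath, String fileName) {
        List<String> hits = new ArrayList<>();
        if (searchPath == null || searchPath.isEmpty()) {
            return hits;
        }
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // args[0]: class to probe (assumed default), args[1]: native library file name
        String className = args.length > 0 ? args[0] : "jcuda.runtime.JCuda";
        System.out.println(className + " on class path: " + isClassAvailable(className));
        if (args.length > 1) {
            String libPath = System.getProperty("java.library.path");
            System.out.println(args[1] + " found in: " + findOnPath(libPath, args[1]));
        }
    }
}
```

A check like this would let the accelerator option fail early with a clear
message instead of surfacing an UnsatisfiedLinkError in the middle of a job.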

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar



From:	Deron Eriksson <deroneriksson@gmail.com>
To:	dev@systemml.incubator.apache.org
Date:	05/18/2016 10:51 AM
Subject:	Re: Discussion on GPU backend



Hi,

I'm wondering what would be a good way to handle JCuda in terms of the
build release packages. Currently we have 11 artifacts that we are
building:
   systemml-0.10.0-incubating-SNAPSHOT-inmemory.jar
   systemml-0.10.0-incubating-SNAPSHOT-javadoc.jar
   systemml-0.10.0-incubating-SNAPSHOT-sources.jar
   systemml-0.10.0-incubating-SNAPSHOT-src.tar.gz
   systemml-0.10.0-incubating-SNAPSHOT-src.zip
   systemml-0.10.0-incubating-SNAPSHOT-standalone.jar
   systemml-0.10.0-incubating-SNAPSHOT-standalone.tar.gz
   systemml-0.10.0-incubating-SNAPSHOT-standalone.zip
   systemml-0.10.0-incubating-SNAPSHOT.jar
   systemml-0.10.0-incubating-SNAPSHOT.tar.gz
   systemml-0.10.0-incubating-SNAPSHOT.zip

It looks like JCuda is platform-specific, so you typically need different
jars/dlls/sos/etc for each platform. If I'm understanding things correctly,
if we generated Windows/Linux/LinuxPowerPC/MacOS-specific SystemML
artifacts for JCuda, we'd potentially have an enormous number of artifacts.

Could this be handled by platform-specific profiles in the pom, so that a
user could run something like "mvn clean package -P jcuda-windows" and be
responsible for building the platform-specific SystemML jar for jcuda? Or
could it be handled differently, by putting the platform-specific jcuda jar on
the classpath and any dlls or other needed libraries on the path?

Deron



On Tue, May 17, 2016 at 10:50 PM, Niketan Pansare <npansar@us.ibm.com>
wrote:

> Hi Luciano,
>
> As with all our backends, there is no change in the programming model. The
> user submits a DML script and specifies whether she wants to use an
> accelerator. Assuming that we compile the jcuda jars into SystemML.jar, the
> user can use the GPU backend with the following command:
> spark-submit --master yarn-client ... -f MyAlgo.dml -accelerator -exec hybrid_spark
>
> The user also needs to set LD_LIBRARY_PATH so that it points to the JCuda
> DLL or .so files. Please see
> https://issues.apache.org/jira/browse/SPARK-1720 ... For example, the
> user can add the following to spark-env.sh:
> export LD_LIBRARY_PATH=<path to jcuda so>:$LD_LIBRARY_PATH
>
> The first version of the GPU backend will only accelerate CP. In this case,
> we have four types of instructions:
> 1. CP
> 2. GPU (requires GPU on the driver)
> 3. SPARK
> 4. MR
>
> Note, the first version will require the CUDA/JCuda dependency to be
> installed on the driver only.
>
> The next version will accelerate our distributed instructions as well. In
> this case, we will have six types of instructions:
> 1. CP
> 2. GPU
> 3. SPARK
> 4. MR
> 5. SPARK-GPU (requires GPU cluster)
> 6. MR-GPU (requires GPU cluster)
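
The two instruction lists above can be sketched as a small lookup table. This
is only an illustration under stated assumptions: the names GpuNeed,
InstructionType, sinceVersion and usable are hypothetical and do not come from
SystemML, and I assume that the plain GPU instruction keeps needing a GPU on
the driver only in the next version, which the message does not state.

```java
import java.util.EnumSet;

// Hypothetical: where a GPU has to be present for an instruction type.
enum GpuNeed { NONE, DRIVER, CLUSTER }

// Hypothetical encoding of the instruction types listed in the message.
enum InstructionType {
    CP(GpuNeed.NONE, 1),
    GPU(GpuNeed.DRIVER, 1),        // "requires GPU on the driver"
    SPARK(GpuNeed.NONE, 1),
    MR(GpuNeed.NONE, 1),
    SPARK_GPU(GpuNeed.CLUSTER, 2), // "requires GPU cluster"
    MR_GPU(GpuNeed.CLUSTER, 2);    // "requires GPU cluster"

    final GpuNeed need;
    final int sinceVersion; // 1 = first version, 2 = next version

    InstructionType(GpuNeed need, int sinceVersion) {
        this.need = need;
        this.sinceVersion = sinceVersion;
    }

    // Instruction types that exist in the given version and whose GPU
    // requirement is met by the available hardware.
    static EnumSet<InstructionType> usable(int version,
                                           boolean gpuOnDriver,
                                           boolean gpuOnCluster) {
        EnumSet<InstructionType> out = EnumSet.noneOf(InstructionType.class);
        for (InstructionType t : values()) {
            boolean hardwareOk = t.need == GpuNeed.NONE
                    || (t.need == GpuNeed.DRIVER && gpuOnDriver)
                    || (t.need == GpuNeed.CLUSTER && gpuOnCluster);
            if (t.sinceVersion <= version && hardwareOk) {
                out.add(t);
            }
        }
        return out;
    }
}
```

For example, usable(1, true, false) describes the first version running on a
driver with a GPU and a cluster without one.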
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> From: Luciano Resende <luckbr1975@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 05/17/2016 09:13 PM
> Subject: Re: Discussion on GPU backend
> ------------------------------
>
>
>
> Great to see detailed information on this topic Niketan, I guess I have
> missed when you posted it initially.
>
> Could you elaborate a little more on the programming model for when
> the user wants to leverage the GPU? Also, today I can submit a job to Spark
> using --jars and it will handle copying the dependencies to the worker
> nodes. If my application wants to leverage the GPU, what extra dependencies
> will be required on the worker nodes, and how are they going to be
> installed/updated on the Spark cluster?
>
>
>
> On Tue, May 3, 2016 at 1:26 PM, Niketan Pansare <npansar@us.ibm.com>
> wrote:
>
> >
> >
> > Hi all,
> >
> > I have updated the design document for our GPU backend in the JIRA
> > https://issues.apache.org/jira/browse/SYSTEMML-445. The implementation
> > details are based on the prototype I created, which is available in PR
> > https://github.com/apache/incubator-systemml/pull/131. Once we are done
> > with the discussion, I can clean up the GPU backend and split it out into
> > a separate PR for easier review :)
> >
> > Here are key design points:
> > A GPU backend would implement two abstract classes:
> >    1.   GPUContext
> >    2.   GPUObject
> >
> >
> >
> > The GPUContext is responsible for GPU memory management and gets
> > call-backs from SystemML's bufferpool on the following methods:
> >    1.   void acquireRead(MatrixObject mo)
> >    2.   void acquireModify(MatrixObject mo)
> >    3.   void release(MatrixObject mo, boolean isGPUCopyModified)
> >    4.   void exportData(MatrixObject mo)
> >    5.   void evict(MatrixObject mo)
> >
> >
> >
> > A GPUObject (like RDDObject and BroadcastObject) is stored in a
> > CacheableData object. It contains the following methods, which are called
> > back from the corresponding GPUContext:
> >    1.   void allocateMemoryOnDevice()
> >    2.   void deallocateMemoryOnDevice()
> >    3.   long getSizeOnDevice()
> >    4.   void copyFromHostToDevice()
> >    5.   void copyFromDeviceToHost()
> >
> >
> >
> > In the initial implementation, we will add JCudaContext and JCudaPointer,
> > which will extend the above abstract classes respectively. The
> > JCudaContext will be created by ExecutionContextFactory depending on the
> > user-specified accelerator. Analogous to MR/SPARK/CP, we will add a new
> > ExecType: GPU and implement GPU instructions.
> >
> > The above design is general enough so that other people can implement
> > custom accelerators (for example: OpenCL) and also follows the design
> > principles of our CP bufferpool.
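
The two abstract classes described above can be sketched as follows. The
method names and signatures of GPUContext and GPUObject are taken from the
lists in this message; everything else is an assumption for illustration.
MatrixObject here is a tiny stand-in for the real SystemML class, and
ToyDeviceObject and ToyDeviceContext are hypothetical classes that use a plain
Java array in place of device memory. What each call-back does in the toy
backend is my reading of the method names, not the prototype's behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for SystemML's MatrixObject; the real class is far richer.
final class MatrixObject {
    double[] hostData; // host-side copy of the matrix values
    MatrixObject(double[] hostData) { this.hostData = hostData; }
}

// Signatures as listed in the design points.
abstract class GPUObject {
    abstract void allocateMemoryOnDevice();
    abstract void deallocateMemoryOnDevice();
    abstract long getSizeOnDevice();
    abstract void copyFromHostToDevice();
    abstract void copyFromDeviceToHost();
}

abstract class GPUContext {
    abstract void acquireRead(MatrixObject mo);
    abstract void acquireModify(MatrixObject mo);
    abstract void release(MatrixObject mo, boolean isGPUCopyModified);
    abstract void exportData(MatrixObject mo);
    abstract void evict(MatrixObject mo);
}

// Hypothetical toy backend: a Java array plays the role of device memory.
final class ToyDeviceObject extends GPUObject {
    final MatrixObject mo;
    double[] device;     // null means "not allocated on the device"
    boolean deviceDirty; // device copy is newer than the host copy
    ToyDeviceObject(MatrixObject mo) { this.mo = mo; }
    void allocateMemoryOnDevice() {
        if (device == null) device = new double[mo.hostData.length];
    }
    void deallocateMemoryOnDevice() { device = null; deviceDirty = false; }
    long getSizeOnDevice() { return device == null ? 0L : 8L * device.length; }
    void copyFromHostToDevice() {
        System.arraycopy(mo.hostData, 0, device, 0, device.length);
    }
    void copyFromDeviceToHost() {
        System.arraycopy(device, 0, mo.hostData, 0, device.length);
        deviceDirty = false;
    }
}

final class ToyDeviceContext extends GPUContext {
    final Map<MatrixObject, ToyDeviceObject> objects = new HashMap<>();

    // Make sure the matrix has an up-to-date copy on the "device".
    private ToyDeviceObject onDevice(MatrixObject mo) {
        ToyDeviceObject o = objects.computeIfAbsent(mo, ToyDeviceObject::new);
        if (o.device == null) {
            o.allocateMemoryOnDevice();
            o.copyFromHostToDevice();
        }
        return o;
    }
    void acquireRead(MatrixObject mo) { onDevice(mo); }
    void acquireModify(MatrixObject mo) { onDevice(mo); }
    void release(MatrixObject mo, boolean isGPUCopyModified) {
        ToyDeviceObject o = objects.get(mo);
        if (o != null && isGPUCopyModified) o.deviceDirty = true;
    }
    void exportData(MatrixObject mo) {
        ToyDeviceObject o = objects.get(mo);
        if (o != null && o.deviceDirty) o.copyFromDeviceToHost();
    }
    void evict(MatrixObject mo) {
        exportData(mo); // write device-side changes back before freeing
        ToyDeviceObject o = objects.remove(mo);
        if (o != null) o.deallocateMemoryOnDevice();
    }

    public static void main(String[] args) {
        MatrixObject mo = new MatrixObject(new double[] {1.0, 2.0, 3.0});
        ToyDeviceContext ctx = new ToyDeviceContext();
        ctx.acquireModify(mo);
        ctx.objects.get(mo).device[0] = 42.0; // pretend a GPU kernel wrote this
        ctx.release(mo, true);
        ctx.evict(mo);
        System.out.println("host value after evict: " + mo.hostData[0]);
    }
}
```

In this reading, a second backend such as OpenCL would only need its own pair
of subclasses, and the bufferpool would keep calling the same five context
methods.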
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>
>
>
>


