systemml-dev mailing list archives

From "Niketan Pansare" <>
Subject Discussion on GPU backend
Date Tue, 03 May 2016 20:26:19 GMT

Hi all,

I have updated the design document for our GPU backend in the JIRA. The
implementation details are based on the prototype I created, which is
available in the PR. Once we are done with the discussion, I can clean up
and separate out the GPU backend into a separate PR for easier review :)

Here are the key design points:
A GPU backend would implement two abstract classes:
   1.	GPUContext
   2.	GPUObject

The GPUContext is responsible for GPU memory management and receives
callbacks from SystemML's bufferpool on the following methods:
   1.	void acquireRead(MatrixObject mo)
   2.	void acquireModify(MatrixObject mo)
   3.	void release(MatrixObject mo, boolean isGPUCopyModified)
   4.	void exportData(MatrixObject mo)
   5.	void evict(MatrixObject mo)
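
As a rough illustration of the callback surface above, the five methods
could be declared as an abstract class and exercised with a toy backend
that only records which callback fired. This is a sketch, not the prototype
code: the MatrixObject class here is a minimal stand-in for SystemML's real
class, LoggingGPUContext is a hypothetical name, and the call order in
main() is an assumption about how the bufferpool might drive the context.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for SystemML's MatrixObject (assumption: only a name is kept here).
class MatrixObject {
    final String name;
    MatrixObject(String name) { this.name = name; }
}

// The five bufferpool callbacks from the list above.
abstract class GPUContext {
    abstract void acquireRead(MatrixObject mo);
    abstract void acquireModify(MatrixObject mo);
    abstract void release(MatrixObject mo, boolean isGPUCopyModified);
    abstract void exportData(MatrixObject mo);
    abstract void evict(MatrixObject mo);
}

// Hypothetical toy backend: records each callback, does no GPU work.
class LoggingGPUContext extends GPUContext {
    final List<String> log = new ArrayList<>();

    @Override void acquireRead(MatrixObject mo) { log.add("acquireRead:" + mo.name); }
    @Override void acquireModify(MatrixObject mo) { log.add("acquireModify:" + mo.name); }
    @Override void release(MatrixObject mo, boolean isGPUCopyModified) {
        log.add("release:" + mo.name + ":modified=" + isGPUCopyModified);
    }
    @Override void exportData(MatrixObject mo) { log.add("exportData:" + mo.name); }
    @Override void evict(MatrixObject mo) { log.add("evict:" + mo.name); }
}

class GpuContextSketch {
    public static void main(String[] args) {
        LoggingGPUContext ctx = new LoggingGPUContext();
        MatrixObject x = new MatrixObject("X");
        // Assumed sequence: modify on device, release, export to host, evict.
        ctx.acquireModify(x);
        ctx.release(x, true);
        ctx.exportData(x);
        ctx.evict(x);
        System.out.println(ctx.log);
    }
}
```

A real backend would replace the logging with device allocation and
transfer logic; the point here is only the shape of the interface.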

A GPUObject (like RDDObject and BroadcastObject) is stored in a
CacheableData object. It contains the following methods, which are called
back from the corresponding GPUContext:
   1.	void allocateMemoryOnDevice()
   2.	void deallocateMemoryOnDevice()
   3.	long getSizeOnDevice()
   4.	void copyFromHostToDevice()
   5.	void copyFromDeviceToHost()
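
To make the GPUObject contract above more concrete, here is a minimal
sketch in which a second Java array plays the role of device memory. The
class FakeDeviceObject and its fields are hypothetical and exist only for
illustration; a real implementation would allocate and copy through the
accelerator's own API rather than System.arraycopy.

```java
import java.util.Arrays;

// The five device-side methods from the list above.
abstract class GPUObject {
    abstract void allocateMemoryOnDevice();
    abstract void deallocateMemoryOnDevice();
    abstract long getSizeOnDevice();
    abstract void copyFromHostToDevice();
    abstract void copyFromDeviceToHost();
}

// Hypothetical toy implementation: "device" is just another heap array.
class FakeDeviceObject extends GPUObject {
    final double[] host;
    private double[] device; // null means nothing is allocated on the device

    FakeDeviceObject(double[] host) { this.host = host; }

    @Override void allocateMemoryOnDevice() {
        if (device == null) device = new double[host.length];
    }

    @Override void deallocateMemoryOnDevice() { device = null; }

    // Size in bytes of the simulated device buffer, 0 if not allocated.
    @Override long getSizeOnDevice() {
        return device == null ? 0L : (long) device.length * Double.BYTES;
    }

    @Override void copyFromHostToDevice() {
        allocateMemoryOnDevice();
        System.arraycopy(host, 0, device, 0, host.length);
    }

    @Override void copyFromDeviceToHost() {
        if (device == null) throw new IllegalStateException("no device copy");
        System.arraycopy(device, 0, host, 0, host.length);
    }
}

class GpuObjectSketch {
    public static void main(String[] args) {
        FakeDeviceObject obj = new FakeDeviceObject(new double[] {1.0, 2.0, 3.0});
        obj.copyFromHostToDevice();
        System.out.println("bytes on device: " + obj.getSizeOnDevice());
        obj.host[0] = 9.0;            // host copy diverges from the device copy
        obj.copyFromDeviceToHost();   // device copy overwrites the host copy
        System.out.println(Arrays.toString(obj.host));
        obj.deallocateMemoryOnDevice();
    }
}
```

Having getSizeOnDevice() report 0 when nothing is allocated is a choice
made for this sketch; it lets a context sum sizes across objects without
checking allocation state first.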

In the initial implementation, we will add JCudaContext and JCudaPointer,
which will extend the above abstract classes respectively. The JCudaContext
will be created by ExecutionContextFactory depending on the user-specified
accelerator. Analogous to MR/SPARK/CP, we will add a new ExecType: GPU and
implement GPU instructions.
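
The selection step described above might look roughly like the following.
Everything here is a hedged sketch: the factory class name, the
createGPUContext and execTypeFor methods, the "cuda" accelerator string and
the backendName() method are all assumptions made for illustration, the
enum lists only the exec types named in this mail, and the JCudaContext
stub does not call JCuda at all.

```java
import java.util.Locale;

// Exec types named in this mail, plus the proposed GPU entry.
enum ExecType { CP, MR, SPARK, GPU }

// Reduced to one hypothetical method so the sketch stays short.
abstract class GPUContext {
    abstract String backendName();
}

// Stub only: a real JCudaContext would manage device memory via JCuda.
class JCudaContext extends GPUContext {
    @Override String backendName() { return "jcuda"; }
}

// Hypothetical stand-in for the factory logic; not SystemML's actual API.
class ExecutionContextFactorySketch {
    // Returns a context for the requested accelerator, or null if none is requested.
    static GPUContext createGPUContext(String accelerator) {
        if (accelerator == null || accelerator.isEmpty()) return null;
        switch (accelerator.toLowerCase(Locale.ROOT)) {
            case "cuda":
                return new JCudaContext();
            default:
                throw new IllegalArgumentException("Unsupported accelerator: " + accelerator);
        }
    }

    // Falls back to CP when no GPU context is available.
    static ExecType execTypeFor(GPUContext ctx) {
        return ctx != null ? ExecType.GPU : ExecType.CP;
    }
}

class FactorySketchDemo {
    public static void main(String[] args) {
        GPUContext ctx = ExecutionContextFactorySketch.createGPUContext("cuda");
        System.out.println(ctx.backendName() + " -> "
                + ExecutionContextFactorySketch.execTypeFor(ctx));
    }
}
```

Another accelerator (OpenCL, say) would add one more case to the switch and
one more GPUContext subclass, without touching the rest.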

The above design is general enough that other people can implement custom
accelerator backends (for example, OpenCL), and it also follows the design
principles of our CP bufferpool.


Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At
