spark-dev mailing list archives

From "Ulanov, Alexander"
Subject Re: Which linear algebra interface to use within Spark MLlib?
Date Thu, 19 Mar 2015 21:25:29 GMT
Thanks for the quick response.

I can use linalg.BLAS.gemm, but this means that I have to use MLlib Matrix. The latter does
not support some useful functionality needed for optimization, for example creating a Matrix
from a matrix size, an array, and an offset into that array. This means that I will need to
create the matrix in Breeze and convert it to MLlib. Also, linalg.BLAS lacks some useful BLAS
functions that I need and that can be found in Breeze (and netlib-java). The same concerns
apply to MLlib Vector.

Best regards, Alexander
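
The constructor described above (matrix size, backing array, offset into that array) can be sketched as follows. This is a minimal illustration in plain Java, not the Breeze or MLlib API: the class name MatrixView and its methods are hypothetical, and column-major storage without padding is an assumption.

```java
// Hypothetical sketch: a dense, column-major matrix view over a shared array.
// Not part of Breeze, MLlib or netlib-java.
final class MatrixView {
    final int rows;
    final int cols;
    final int offset;
    final double[] data; // shared with the caller, never copied

    MatrixView(int rows, int cols, double[] data, int offset) {
        if (rows < 0 || cols < 0 || offset < 0
                || (long) offset + (long) rows * cols > data.length) {
            throw new IllegalArgumentException("view does not fit in the backing array");
        }
        this.rows = rows;
        this.cols = cols;
        this.data = data;
        this.offset = offset;
    }

    // Element (i, j) lives at offset + i + j * rows in the backing array.
    double get(int i, int j) {
        return data[offset + i + j * rows];
    }

    void set(int i, int j, double value) {
        data[offset + i + j * rows] = value;
    }

    public static void main(String[] args) {
        // One flat buffer, e.g. all parameters handed to an optimizer.
        double[] params = new double[10];
        // A 2 x 3 matrix occupying params[4] .. params[9].
        MatrixView w = new MatrixView(2, 3, params, 4);
        w.set(1, 2, 9.0);
        System.out.println(params[9]); // prints 9.0
    }
}
```

Because the view writes straight into the shared buffer, an optimizer that updates the flat array also updates every matrix defined over it, with no conversion or copy.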

On 19.03.2015, at 14:16, "Debasish Das" wrote:

I think for Breeze we are focused on dot and dgemv right now (along with several other matrix
vector style operations)...

For dgemm it is tricky since you need to add dgemm for both DenseMatrix and CSCMatrix...and
for CSCMatrix you need to get something like SuiteSparse which is under we have
to think more on it..

For now, can't you use dgemm directly from mllib.linalg.BLAS? It's in master...

On Thu, Mar 19, 2015 at 1:49 PM, Ulanov, Alexander wrote:
Thank you! When do you expect to have gemm in Breeze and that version of Breeze to ship with

Also, could someone please elaborate on linalg.BLAS and Matrix? Are they going to be developed
further, and should all developers use them in the long term?

Best regards, Alexander

On 18.03.2015, at 23:21, "Debasish Das" wrote:

dgemm, dgemv and dot come to Breeze and Spark through netlib-java....

Right now, in both dot and dgemv, Breeze does an extra memory allocation, but we have already
found the issue and we are working on adding a common trait that will provide a sink operation
(basically, memory will be allocated by the user)...adding more BLAS operators in Breeze will
also help in general, as a lot more operations are defined over there...
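
The difference between an allocating call and a "sink" call can be sketched as below. This is plain Java for illustration only, not the Breeze trait under discussion: the names GemvSink, gemv and gemvInto are hypothetical, and the matrix is assumed to be dense and column-major.

```java
// Hypothetical sketch of a "sink" style matrix-vector product.
// Not the Breeze design; it only illustrates who owns the output memory.
final class GemvSink {
    // Sink variant: y := alpha * A * x + beta * y, written into the caller's y.
    // A is m x n, column-major, no padding; x has length n, y has length m.
    static void gemvInto(int m, int n, double alpha, double[] a, double[] x,
                         double beta, double[] y) {
        for (int i = 0; i < m; i++) {
            y[i] = (beta == 0.0) ? 0.0 : beta * y[i];
        }
        for (int j = 0; j < n; j++) {
            double t = alpha * x[j];
            for (int i = 0; i < m; i++) {
                y[i] += t * a[i + j * m];
            }
        }
    }

    // Allocating variant: convenient, but creates a new array on every call.
    static double[] gemv(int m, int n, double[] a, double[] x) {
        double[] y = new double[m];
        gemvInto(m, n, 1.0, a, x, 0.0, y);
        return y;
    }

    public static void main(String[] args) {
        double[] a = {1, 3, 2, 4}; // [[1, 2], [3, 4]] in column-major order
        double[] x = {1, 1};
        double[] y = new double[2]; // allocated once by the caller
        gemvInto(2, 2, 1.0, a, x, 0.0, y);
        System.out.println(java.util.Arrays.toString(y)); // prints [3.0, 7.0]
    }
}
```

In a loop that runs the same product many times, the sink variant lets the caller reuse one output array, which is the saving described above.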

On Wed, Mar 18, 2015 at 8:09 PM, Ulanov, Alexander wrote:

Currently I am using Breeze within Spark MLlib for linear algebra. I would like to reuse previously
allocated matrices for storing the result of matrix multiplication, i.e. I need the "gemm"
function C:=q*A*B+p*C, which is missing in Breeze (Breeze automatically allocates a new matrix
to store the result of multiplication). Also, I would like to minimize the number of gemm calls
that Breeze makes. Should I use mllib.linalg.BLAS functions instead? While it has gemm and axpy,
it offers a rather limited number of operations. For example, I need to sum a matrix by rows or
by columns, or to apply a function to all elements of a matrix. Also, the MLlib Vector and Matrix
interfaces that linalg.BLAS operates on seem to be rather undeveloped. Should I use plain
netlib-java instead (will it remain in MLlib in future releases)?

Best regards, Alexander
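
For reference, the operations asked for in this message can be sketched without any library. The code below is plain Java, not Breeze, MLlib or netlib-java: the class InPlaceGemm and its method names are hypothetical, matrices are assumed to be dense and column-major, and the loops are a naive stand-in for an optimized BLAS. The point is only that the result is written into memory the caller already owns.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: in-place operations on dense column-major arrays.
final class InPlaceGemm {
    // C := q * A * B + p * C, overwriting C.
    // A is m x k, B is k x n, C is m x n.
    static void gemm(int m, int n, int k, double q, double[] a, double[] b,
                     double p, double[] c) {
        for (int j = 0; j < n; j++) {
            // Scale column j of C first, then accumulate q * A * B into it.
            for (int i = 0; i < m; i++) {
                c[i + j * m] = (p == 0.0) ? 0.0 : p * c[i + j * m];
            }
            for (int l = 0; l < k; l++) {
                double t = q * b[l + j * k];
                for (int i = 0; i < m; i++) {
                    c[i + j * m] += t * a[i + l * m];
                }
            }
        }
    }

    // Sum of each column of an m x n matrix, written into out (length n).
    static void colSums(int m, int n, double[] a, double[] out) {
        for (int j = 0; j < n; j++) {
            double s = 0.0;
            for (int i = 0; i < m; i++) {
                s += a[i + j * m];
            }
            out[j] = s;
        }
    }

    // Applies f to every element, overwriting the input array.
    static void mapInPlace(double[] a, DoubleUnaryOperator f) {
        for (int i = 0; i < a.length; i++) {
            a[i] = f.applyAsDouble(a[i]);
        }
    }

    public static void main(String[] args) {
        double[] a = {1, 3, 2, 4}; // [[1, 2], [3, 4]] in column-major order
        double[] b = {5, 7, 6, 8}; // [[5, 6], [7, 8]]
        double[] c = {1, 1, 1, 1}; // previously allocated result buffer
        gemm(2, 2, 2, 1.0, a, b, 2.0, c);
        System.out.println(java.util.Arrays.toString(c)); // prints [21.0, 45.0, 24.0, 52.0]
    }
}
```

Row sums follow the same pattern as colSums with the loops swapped. In real code the gemm body would be a single call into a BLAS binding rather than three nested loops.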
