spark-user mailing list archives

From Tianshuo Deng <td...@twitter.com.INVALID>
Subject [mllib] GradientDescent requires huge memory for storing weight vector
Date Tue, 13 Jan 2015 00:26:10 GMT
Hi,
Currently in GradientDescent.scala, the weights are constructed as a dense
vector:

    initialWeights = Vectors.dense(new Array[Double](numFeatures))

numFeatures is determined in loadLibSVMFile as the maximum feature index
found in the data.
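
To make the max-index rule concrete, here is a minimal sketch in plain Scala
(no Spark dependency). It is not MLlib's actual code: the object name
MaxIndexSketch and both helper functions are hypothetical, and the sample
lines are made up for illustration. It assumes the usual LIBSVM text layout
"label index:value index:value ..." with one-based indices.

```scala
object MaxIndexSketch {
  // Return the one-based feature indices from one LIBSVM-style line.
  // The first token is the label, so it is dropped.
  def featureIndices(line: String): Seq[Int] =
    line.trim.split("\\s+").drop(1).map(_.split(':')(0).toInt).toSeq

  // Infer the feature count as the largest index seen in any line
  // (0 if there are no features at all).
  def inferNumFeatures(lines: Seq[String]): Int =
    lines.flatMap(featureIndices).foldLeft(0)((a, b) => math.max(a, b))
}
```

With the two made-up lines "1 3:0.5 1000000:1.0" and "0 7:2.0", only three
distinct features are present, yet the inferred feature count is 1000000,
because a single large index sets the size.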

But when a hash function is used to compute feature indices, the maximum
index can be very large even if few features are active, so this produces a
huge dense vector that takes up a lot of memory.
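
A rough back-of-the-envelope sketch of the cost, again in plain Scala. The
object and function names are hypothetical, and the figures used afterwards
(2^28 hash buckets, 50,000 active features) are assumed values for
illustration, not measurements. It counts only raw payload bytes: 8 bytes per
Double for a dense array, and 4 bytes of Int index plus 8 bytes of Double
value per stored entry for a sparse representation, ignoring JVM object
overhead.

```scala
object DenseWeightCost {
  // Payload bytes of a dense Array[Double] with one slot per feature.
  def denseBytes(numFeatures: Long): Long = numFeatures * 8L

  // Payload bytes of a sparse representation that stores an Int index
  // and a Double value for each non-zero entry only.
  def sparseBytes(nonZeros: Long): Long = nonZeros * (4L + 8L)
}
```

Under these assumptions, DenseWeightCost.denseBytes(1L << 28) is 2147483648
bytes (2 GiB) for the weight vector alone, while
DenseWeightCost.sparseBytes(50000L) is 600000 bytes.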

Any suggestions?
