spark-user mailing list archives

From "EcoMotto Inc." <ecomot...@gmail.com>
Subject Getting incorrect weights for LinearRegression
Date Wed, 11 Mar 2015 18:59:01 GMT
Hello,

I am trying to run LinearRegression on the dummy data set given below. I
have tried many different settings, but I still cannot reproduce the
expected coefficients.

Please help me out, as I am facing the same problem with my actual dataset.
Thank you.

This dataset is generated based on the simple equation: Y = 4 + (2 * x1) +
(3 * x2)

*Data:*
y,x1,x2
6.3,1,0.1
8.6,2,0.2
10.9,3,0.3
13.8,4,0.6
16.4,5,0.8
19.6,6,1.2
22.8,7,1.6
25.7,8,1.9
28.3,9,2.1
31.2,10,2.4
34.1,11,2.7
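
As a sanity check, the rows above can be tested against the generating equation in plain Scala, without Spark. This is only a minimal sketch; the names `CheckData`, `rows` and `maxResidual` are made up for illustration.

```scala
object CheckData {
  // (y, x1, x2) rows copied from the data set above.
  val rows: Seq[(Double, Double, Double)] = Seq(
    (6.3, 1.0, 0.1), (8.6, 2.0, 0.2), (10.9, 3.0, 0.3), (13.8, 4.0, 0.6),
    (16.4, 5.0, 0.8), (19.6, 6.0, 1.2), (22.8, 7.0, 1.6), (25.7, 8.0, 1.9),
    (28.3, 9.0, 2.1), (31.2, 10.0, 2.4), (34.1, 11.0, 2.7)
  )

  // Largest absolute deviation of any row from y = 4 + 2*x1 + 3*x2.
  def maxResidual: Double =
    rows.map { case (y, x1, x2) => math.abs(y - (4.0 + 2.0 * x1 + 3.0 * x2)) }.max

  def main(args: Array[String]): Unit =
    println(s"max residual: $maxResidual")
}
```

A residual at the level of floating-point rounding means the data is noise-free, so an exact fit exists.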

*Spark Code:*
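import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.SimpleUpdater
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

val data = sc.textFile("Data/tempData_1.csv")

// Drop the header line (first partition only), then parse each row.
// The leading 1.0 is a constant feature standing in for the intercept.
val parsedData = data.mapPartitionsWithIndex { (idx, iter) =>
  if (idx == 0) iter.drop(1) else iter
}.map { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble,
    Vectors.dense(Array(1.0, parts(1).toDouble, parts(2).toDouble)))
}.cache()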

val numIterations = 400
val step = 0.01
val algorithm = new LinearRegressionWithSGD()
// The intercept is modelled by the constant 1.0 feature. I also tried
// setIntercept(true) with only (x1, x2) as features.
algorithm.setIntercept(false)
algorithm.optimizer
  .setStepSize(step)
  .setNumIterations(numIterations)
  .setUpdater(new SimpleUpdater())
  .setMiniBatchFraction(1.0)
  //.setRegParam(0.1)

val initialWeights =
  Vectors.dense(Array.fill(3)(scala.util.Random.nextDouble()))

val model = algorithm.run(parsedData, initialWeights)
println(s">>>> Model intercept: ${model.intercept}, weights: ${model.weights}")
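
For reference, the coefficients the fit should reproduce can be computed directly from the normal equations (X^T X) w = X^T y in plain Scala, without Spark. This is a hedged sketch and not what LinearRegressionWithSGD does internally; `NormalEquations`, `solve` and `fit` are illustrative names. Since the data satisfies the equation exactly, the result should come out very close to 4, 2 and 3.

```scala
object NormalEquations {
  // (y, x1, x2) rows copied from the data set in this message.
  val rows: Seq[(Double, Double, Double)] = Seq(
    (6.3, 1.0, 0.1), (8.6, 2.0, 0.2), (10.9, 3.0, 0.3), (13.8, 4.0, 0.6),
    (16.4, 5.0, 0.8), (19.6, 6.0, 1.2), (22.8, 7.0, 1.6), (25.7, 8.0, 1.9),
    (28.3, 9.0, 2.1), (31.2, 10.0, 2.4), (34.1, 11.0, 2.7)
  )

  // Solve the square system a * w = b by Gaussian elimination with partial pivoting.
  def solve(a: Array[Array[Double]], b: Array[Double]): Array[Double] = {
    val n = b.length
    // Augmented matrix [a | b], copied so the inputs are not modified.
    val m = Array.tabulate(n, n + 1)((i, j) => if (j < n) a(i)(j) else b(i))
    for (col <- 0 until n) {
      // Swap in the row with the largest absolute value in this column.
      val pivot = (col until n).maxBy(r => math.abs(m(r)(col)))
      val tmp = m(col); m(col) = m(pivot); m(pivot) = tmp
      // Eliminate this column from all rows below the pivot row.
      for (r <- col + 1 until n) {
        val f = m(r)(col) / m(col)(col)
        for (c <- col to n) m(r)(c) -= f * m(col)(c)
      }
    }
    // Back substitution on the upper-triangular system.
    val w = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {
      val known = (i + 1 until n).map(j => m(i)(j) * w(j)).sum
      w(i) = (m(i)(n) - known) / m(i)(i)
    }
    w
  }

  // Least-squares weights for the design matrix with columns [1, x1, x2].
  def fit: Array[Double] = {
    val xs = rows.map { case (_, x1, x2) => Array(1.0, x1, x2) }
    val ys = rows.map(_._1)
    val xtx = Array.tabulate(3, 3)((i, j) => xs.map(x => x(i) * x(j)).sum)
    val xty = Array.tabulate(3)(i => xs.zip(ys).map { case (x, y) => x(i) * y }.sum)
    solve(xtx, xty)
  }

  def main(args: Array[String]): Unit =
    println(fit.mkString("intercept, w1, w2 = ", ", ", ""))
}
```

The first entry of the result corresponds to the constant 1.0 feature, so it can be compared with the first weight printed by the Spark code above when setIntercept(false) is used.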



Regards,
Arun
