spark-user mailing list archives

From Yanbo Liang <yanboha...@gmail.com>
Subject Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD
Date Thu, 27 Nov 2014 06:21:54 GMT
Hi Tri,

My previous reply to your question may have been lost. In any case, the
following code snippet compiles and runs correctly.

val model = new StreamingLinearRegressionWithSGD()
  .setInitialWeights(Vectors.zeros(args(3).toInt))

model.algorithm.setIntercept(true)


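Every setXXX() method in StreamingLinearRegressionWithSGD returns this.type,
that is, the streaming instance itself, so those setters can be chained. But
algorithm.setIntercept(true) returns the inner LinearRegressionWithSGD, so if
it ends the chain, model is bound to that object instead, which is why trainOn
and predictOnValues were not found. Set the intercept in a separate statement
and ignore its return value.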
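
To make the typing issue concrete without a Spark dependency, here is a minimal sketch. BatchAlgo and StreamingAlgo are hypothetical stand-ins that I made up for LinearRegressionWithSGD and StreamingLinearRegressionWithSGD; only their return types are meant to mirror what the compile error in this thread implies, not the real implementations.

```scala
// Hypothetical stand-in for the inner batch algorithm (not the Spark class).
class BatchAlgo {
  var addIntercept: Boolean = false
  // Returns the batch algorithm itself, so a chain ending here has type BatchAlgo.
  def setIntercept(flag: Boolean): this.type = { addIntercept = flag; this }
}

// Hypothetical stand-in for the streaming wrapper (not the Spark class).
class StreamingAlgo {
  val algorithm = new BatchAlgo
  var stepSize: Double = 0.1
  // Returns the streaming wrapper, so setters on it can be chained safely.
  def setStepSize(s: Double): this.type = { stepSize = s; this }
  def trainOn(data: Seq[Double]): String = s"training on ${data.size} points"
}

object ChainingDemo {
  // Chained through .algorithm: the value of the whole expression is the inner BatchAlgo.
  val chained: BatchAlgo =
    new StreamingAlgo().setStepSize(0.0001).algorithm.setIntercept(true)
  // chained.trainOn(Seq(1.0))  // would not compile: trainOn is not a member of BatchAlgo

  // Separate statement: model keeps the streaming type.
  val model: StreamingAlgo = new StreamingAlgo().setStepSize(0.0001)
  model.algorithm.setIntercept(true) // return value deliberately ignored

  def main(args: Array[String]): Unit = {
    println(model.trainOn(Seq(1.0, 2.0, 3.0))) // prints "training on 3 points"
    println(model.algorithm.addIntercept)      // prints "true"
  }
}
```

The intercept flag is set in both variants; the only difference is which object the val ends up holding.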

2014-11-27 1:04 GMT+08:00 Bui, Tri <Tri.Bui@verizonwireless.com.invalid>:

> Thanks Yanbo!
>
>
>
> Modified code below:
>
>
>
> val conf = new
> SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
>
>     val ssc = new StreamingContext(conf, Seconds(args(2).toLong))
>
>     val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
>
>     val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)
>
>     val model = new
> StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt)).setNumIterations(args(4).toInt).setStepSize(.0001).algorithm.setIntercept(true)
>
>     model.trainOn(trainingData)
>
>     model.predictOnValues(testData.map(lp => (lp.label,
> lp.features))).print()
>
>     ssc.start()
>
>     ssc.awaitTermination()
>
>
>
> But I am getting a compile error:
>
> [error]
> /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:54:
> value trainOn is not a member
>
> of org.apache.spark.mllib.regression.LinearRegressionWithSGD
>
> [error]     model.trainOn(trainingData)
>
> [error]           ^
>
> [error]
> /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:55:
> value predictOnValues is not a
>
> member of org.apache.spark.mllib.regression.LinearRegressionWithSGD
>
> [error]     model.predictOnValues(testData.map(lp => (lp.label,
> lp.features))).print()
>
> [error]           ^
>
> [error] two errors found
>
> [error] (compile:compile) Compilation failed
>
>
>
> Thanks
>
> Tri
>
>
>
> *From:* Yanbo Liang [mailto:yanbohappy@gmail.com]
> *Sent:* Tuesday, November 25, 2014 8:57 PM
> *To:* Bui, Tri
> *Cc:* user@spark.apache.org
> *Subject:* Re: Inaccurate Estimate of weights model from
> StreamingLinearRegressionWithSGD
>
>
>
> Hi Tri,
>
>
>
> setIntercept() is not a member function of StreamingLinearRegressionWithSGD.
> It is a member function of LinearRegressionWithSGD (inherited from
> GeneralizedLinearAlgorithm), which is a member variable (named algorithm)
> of StreamingLinearRegressionWithSGD.
>
>
>
> So you need to change your code to:
>
> val model = new
> StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
>
> .algorithm.setIntercept(true)
>
>
>
> Thanks
>
> Yanbo
>
>
>
>
>
> 2014-11-25 23:51 GMT+08:00 Bui, Tri <Tri.Bui@verizonwireless.com.invalid>:
>
> Thanks Liang!
>
>
>
> It was my mistake: I fat-fingered one of the data points. After correcting
> it, the result matches yours.
>
>
>
> I am still not able to get the intercept. I am getting:
>
> [error]
> /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:47:
> value setIntercept is not a member of
> org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
>
>
>
> I tried the code below:
>
> val model = new
> StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
>
> model.setIntercept(addIntercept = true).trainOn(trainingData)
>
>
>
> and:
>
>
>
> val model = new
> StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
>
> .setIntercept(true)
>
>
>
> But I still get a compilation error.
>
>
>
> Thanks
>
> Tri
>
>
> *From:* Yanbo Liang [mailto:yanbohappy@gmail.com]
> *Sent:* Tuesday, November 25, 2014 4:08 AM
> *To:* Bui, Tri
> *Cc:* user@spark.apache.org
> *Subject:* Re: Inaccurate Estimate of weights model from
> StreamingLinearRegressionWithSGD
>
>
>
> The case runs correctly in my environment.
>
>
>
> 14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD: Model
> updated at time 1416908900000 ms
>
> 14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD:
> Current model: weights, [0.9999999999998588]
>
>
>
> Can you provide more detailed information, if convenient?
>
>
>
> The intercept can be turned on as follows:
>
> val model = new StreamingLinearRegressionWithSGD()
>
>       .algorithm.setIntercept(true)
>
>
>
> 2014-11-25 3:31 GMT+08:00 Bui, Tri <Tri.Bui@verizonwireless.com.invalid>:
>
> Hi,
>
>
>
> I am getting an incorrect weights model from
> StreamingLinearRegressionWithSGD.
>
>
>
> The one-feature input data is:
>
>
>
> (1,[1])
>
> (2,[2])
>
> …
>
> .
>
> (20,[20])
>
>
>
> The result from "Current model: weights" is [-4.432]…, which is not
> correct.
>
>
>
> Also, how do I turn on the intercept for
> StreamingLinearRegressionWithSGD?
>
>
>
> Thanks
>
> Tri
>
>
>
>
>
