spark-user mailing list archives

From Zhiliang Zhu <zchl.j...@yahoo.com.INVALID>
Subject Re: [SPARK MLLIB] could not understand the wrong and inscrutable result of Linear Regression codes
Date Mon, 26 Oct 2015 04:10:51 GMT
 


     On Monday, October 26, 2015 11:26 AM, Zhiliang Zhu <zchl.jump@yahoo.com.INVALID>
wrote:
   

 Hi DB Tsai,
Thanks very much for your kind help. I get it now.
I am sorry to raise another issue. The weight/coefficient result is perfect when A is a
triangular matrix. However, when A is not triangular (but is
transformed from a triangular matrix and is still invertible), the result is not perfect and
is difficult to improve by resetting the parameters. Would you comment on that?
List<LabeledPoint> localTraining = Lists.newArrayList(
      new LabeledPoint(30.0, Vectors.dense(1.0, 2.0, 3.0, 4.0)),
      new LabeledPoint(29.0, Vectors.dense(0.0, 2.0, 3.0, 4.0)),
      new LabeledPoint(25.0, Vectors.dense(0.0, 0.0, 3.0, 4.0)),
      new LabeledPoint(-3.0, Vectors.dense(0.0, 0.0, -1.0, 0.0)));
...
LinearRegression lr = new LinearRegression()
      .setMaxIter(20000)
      .setRegParam(0)
      .setElasticNetParam(0);
....
----------------------------------------------------------

It seems that no matter how I reset the parameters for lr, the outputs for x3 and x4 are always
nearly the same. Is there some way to make the result a little better?

----------------------------------------------------------

x3 and x4 do not get any better; the output is:
Final w: [0.9999999477672867,1.9999999748740578,3.5000000112393734,3.500000011239377]  


Thank you,
Zhiliang
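
A possible explanation for the 3.5/3.5 result, sketched in plain Java (no Spark involved): in this
data set column 4 equals column 3 plus 1 in every row, so once an intercept is fitted the two
columns are perfectly collinear and only the sum w3 + w4 = 7 is determined. The check below
assumes the fitted intercept was 0.5 (the thread does not print the intercept) and shows that
[1, 2, 3.5, 3.5] with intercept 0.5 and [1, 2, 3, 4] with intercept 0 both reproduce every
label. The class and method names are hypothetical.

```java
public class CollinearColumnsCheck {
    // Feature rows and labels of the second data set in this thread.
    static final double[][] A = {
        {1.0, 2.0, 3.0, 4.0},
        {0.0, 2.0, 3.0, 4.0},
        {0.0, 0.0, 3.0, 4.0},
        {0.0, 0.0, -1.0, 0.0}
    };
    static final double[] Y = {30.0, 29.0, 25.0, -3.0};

    // Largest absolute residual |w . x + intercept - y| over all rows.
    public static double maxResidual(double[] w, double intercept) {
        double worst = 0.0;
        for (int i = 0; i < A.length; i++) {
            double prediction = intercept;
            for (int j = 0; j < w.length; j++) {
                prediction += w[j] * A[i][j];
            }
            worst = Math.max(worst, Math.abs(prediction - Y[i]));
        }
        return worst;
    }

    // True when (column 4 - column 3) is the same constant in every row,
    // i.e. the two columns differ only by an intercept-like offset.
    public static boolean column4IsColumn3PlusConstant() {
        double offset = A[0][3] - A[0][2];
        for (double[] row : A) {
            if (row[3] - row[2] != offset) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("column 4 = column 3 + const: " + column4IsColumn3PlusConstant());
        System.out.println("residual, reported weights with assumed intercept 0.5: "
            + maxResidual(new double[] {1.0, 2.0, 3.5, 3.5}, 0.5));
        System.out.println("residual, expected weights with intercept 0: "
            + maxResidual(new double[] {1.0, 2.0, 3.0, 4.0}, 0.0));
    }
}
```

If this is indeed the cause, calling setFitIntercept(false) on the LinearRegression may remove the
ambiguity, since A itself is invertible. I have not verified that against this data.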
 


     On Monday, October 26, 2015 10:25 AM, DB Tsai <dbtsai@dbtsai.com> wrote:
   

 Column 4 is always constant, so it has no predictive power, which results in a zero weight.
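
To illustrate this in plain Java (this is not Spark code): in the first data set column 4 is 4.0
in every row, so its variance is zero and it cannot be told apart from an intercept term.
Assuming the fitted intercept was 16 (the thread does not print it), the weights [1, 2, 3, 0]
reproduce every label exactly. The class and method names are hypothetical.

```java
public class ConstantColumnCheck {
    // Feature rows and labels of the first data set in this thread.
    static final double[][] A = {
        {1.0, 2.0, 3.0, 4.0},
        {0.0, 2.0, 3.0, 4.0},
        {0.0, 0.0, 3.0, 4.0},
        {0.0, 0.0, 0.0, 4.0}
    };
    static final double[] Y = {30.0, 29.0, 25.0, 16.0};

    // Population variance of one feature column (zero-based column index).
    public static double columnVariance(int column) {
        double mean = 0.0;
        for (double[] row : A) {
            mean += row[column] / A.length;
        }
        double variance = 0.0;
        for (double[] row : A) {
            double diff = row[column] - mean;
            variance += diff * diff / A.length;
        }
        return variance;
    }

    // Largest absolute residual |w . x + intercept - y| over all rows.
    public static double maxResidual(double[] w, double intercept) {
        double worst = 0.0;
        for (int i = 0; i < A.length; i++) {
            double prediction = intercept;
            for (int j = 0; j < w.length; j++) {
                prediction += w[j] * A[i][j];
            }
            worst = Math.max(worst, Math.abs(prediction - Y[i]));
        }
        return worst;
    }

    public static void main(String[] args) {
        for (int j = 0; j < 4; j++) {
            System.out.println("variance of column " + (j + 1) + ": " + columnVariance(j));
        }
        System.out.println("residual, weights [1, 2, 3, 0] with assumed intercept 16: "
            + maxResidual(new double[] {1.0, 2.0, 3.0, 0.0}, 16.0));
    }
}
```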

On Sunday, October 25, 2015, Zhiliang Zhu <zchl.jump@yahoo.com> wrote:

Hi DB Tsai,
Thanks very much for your kind reply.
Following your comment, I modified and tested the key part of the code:
 LinearRegression lr = new LinearRegression()
       .setMaxIter(10000)
       .setRegParam(0)
       .setElasticNetParam(0);  //the number could be reset

 final LinearRegressionModel model = lr.fit(training);
Now the output is much more reasonable. However, x4 is always 0 no matter how I reset those
parameters in lr. Would you help me with how to set the parameters properly?
Final w: [1.000000127825909,1.999999979185054,2.999999993307136,0.0]

Thank you,
Zhiliang

 


     On Monday, October 26, 2015 5:14 AM, DB Tsai <dbtsai@dbtsai.com> wrote:
   

 LinearRegressionWithSGD is not stable. Please use linear regression in
the ML package instead.
http://spark.apache.org/docs/latest/ml-linear-methods.html

Sincerely,

DB Tsai
----------------------------------------------------------
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
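
One plausible reason for weights on the order of 1e25 in the original run, sketched in plain
Java: gradient descent diverges when the step size is too large for the scale of the features.
The sketch below is not Spark's implementation (it uses full-batch updates, a constant step and
no intercept), and the step sizes 1.0 and 0.05 are my own choices for illustration. The class
and method names are hypothetical.

```java
public class GradientDescentStepSize {
    // Feature rows and labels of the first data set in this thread.
    static final double[][] A = {
        {1.0, 2.0, 3.0, 4.0},
        {0.0, 2.0, 3.0, 4.0},
        {0.0, 0.0, 3.0, 4.0},
        {0.0, 0.0, 0.0, 4.0}
    };
    static final double[] Y = {30.0, 29.0, 25.0, 16.0};

    // Full-batch gradient descent on the mean squared error (1 / 2n) * sum (w . x - y)^2,
    // starting from w = 0, with a constant step size and no intercept.
    public static double[] fit(double step, int iterations) {
        int n = A.length;
        int d = A[0].length;
        double[] w = new double[d];
        for (int it = 0; it < iterations; it++) {
            double[] gradient = new double[d];
            for (int i = 0; i < n; i++) {
                double error = -Y[i];
                for (int j = 0; j < d; j++) {
                    error += w[j] * A[i][j];
                }
                for (int j = 0; j < d; j++) {
                    gradient[j] += error * A[i][j] / n;
                }
            }
            for (int j = 0; j < d; j++) {
                w[j] -= step * gradient[j];
            }
        }
        return w;
    }

    // Largest absolute entry of a vector.
    public static double maxAbs(double[] v) {
        double worst = 0.0;
        for (double x : v) {
            worst = Math.max(worst, Math.abs(x));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("step 1.0, 50 iterations:     "
            + java.util.Arrays.toString(fit(1.0, 50)));
        System.out.println("step 0.05, 20000 iterations: "
            + java.util.Arrays.toString(fit(0.05, 20000)));
    }
}
```

With these rows, the step of 1.0 blows up within 50 iterations, while the step of 0.05 recovers
approximately [1, 2, 3, 4].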


On Sun, Oct 25, 2015 at 10:14 AM, Zhiliang Zhu
<zchl.jump@yahoo.com.invalid> wrote:
> Dear All,
>
> I have the program below, whose result I find confusing and inscrutable.
> It is a multi-dimensional linear regression model: the
> weights/coefficients are always perfect when the dimension is smaller than
> 4; otherwise they are wrong every time.
> Or should LinearRegressionWithSGD be replaced with something else?
>
> import java.util.List;
>
> import com.google.common.collect.Lists;
> import org.apache.spark.SparkConf;
> import org.apache.spark.api.java.JavaRDD;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.mllib.linalg.Vectors;
> import org.apache.spark.mllib.regression.LabeledPoint;
> import org.apache.spark.mllib.regression.LinearRegressionModel;
> import org.apache.spark.mllib.regression.LinearRegressionWithSGD;
> import org.apache.spark.sql.SQLContext;
>
> public class JavaLinearRegression {
>  public static void main(String[] args) {
>    SparkConf conf = new SparkConf().setAppName("Linear Regression Example");
>    JavaSparkContext sc = new JavaSparkContext(conf);
>    SQLContext jsql = new SQLContext(sc);
>
>    // Ax = b; x = [1, 2, 3, 4] is the only solution for the weights
>    // x1 + 2 * x2 + 3 * x3 + 4 * x4 = y is the multiple linear model
>    List<LabeledPoint> localTraining = Lists.newArrayList(
>        new LabeledPoint(30.0, Vectors.dense(1.0, 2.0, 3.0, 4.0)),
>        new LabeledPoint(29.0, Vectors.dense(0.0, 2.0, 3.0, 4.0)),
>        new LabeledPoint(25.0, Vectors.dense(0.0, 0.0, 3.0, 4.0)),
>        new LabeledPoint(16.0, Vectors.dense(0.0, 0.0, 0.0, 4.0)));
>
>    JavaRDD<LabeledPoint> training = sc.parallelize(localTraining).cache();
>
>    // Building the model
>    int numIterations = 1000; // this number could be set larger
>    final LinearRegressionModel model =
>        LinearRegressionWithSGD.train(JavaRDD.toRDD(training), numIterations);
>
>    // The coefficient weights are perfect when the dimension of LabeledPoint
>    // is SMALLER than 4; otherwise the output is always wrong and inscrutable.
>    // For instance, one output is
>    // Final w: [2.537341836047772E25,-7.744333206289736E24,6.697875883454909E23,-2.6704705246777624E22]
>    System.out.print("Final w: " + model.weights() + "\n\n");
>  }
> }
>
>  I would appreciate your kind help or guidance very much~~
>
> Thank you!
> Zhiliang
>
>


   


-- 
- DB
Sent from my iPhone


   

  