spark-user mailing list archives

From Yuhao Yang <hhb...@gmail.com>
Subject Re: spark linear regression error training dataset is empty
Date Sun, 25 Dec 2016 20:23:07 GMT
Hi Xiaomeng,

Have you checked the contents of the DataFrame before fitting? For example, call
assembleddata.show()
right before lr.fit to confirm that it actually contains rows.
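
A slightly fuller version of that check might look like the sketch below. It is only an illustration under some assumptions: the object name CheckBeforeFit, the helper requireNonEmpty, the local SparkSession, and the stand-in rows (labels and the first feature of each row from your message, rounded) are made up for the example and are not taken from your code.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.{DataFrame, SparkSession}

object CheckBeforeFit {
  // Hypothetical helper: fail early with a readable message
  // instead of the assertion raised deep inside WeightedLeastSquares.
  def requireNonEmpty(df: DataFrame, name: String): DataFrame = {
    require(df.head(1).nonEmpty,
      s"$name has no rows; check the upstream filters, joins and input path")
    df
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just to keep the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("check-before-fit")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the real assembleddata (one feature per row, rounded).
    val assembleddata = Seq(
      (79.3, Vectors.dense(6412.1435)),
      (72.3, Vectors.dense(6306.0445)),
      (76.7, Vectors.dense(6142.92))
    ).toDF("label", "features")

    // Inspect the schema, a few rows, and the row count before training.
    assembleddata.printSchema()
    assembleddata.show(5, false)
    println(s"rows: ${assembleddata.count()}")

    val lr = new LinearRegression().setMaxIter(300).setFeaturesCol("features")
    val lrModel = lr.fit(requireNonEmpty(assembleddata, "assembleddata"))
    println(s"coefficients: ${lrModel.coefficients}")

    spark.stop()
  }
}
```

If show() prints no rows, the problem is upstream of LinearRegression (for example a filter or join that drops everything), not in the fit call itself.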

Regards,
Yuhao

2016-12-21 10:05 GMT-08:00 Xiaomeng Wan <shawnwan@gmail.com>:

> Hi,
>
> I am running linear regression on a dataframe and get the following error:
>
> Exception in thread "main" java.lang.AssertionError: assertion failed: Training dataset is empty.
>   at scala.Predef$.assert(Predef.scala:170)
>   at org.apache.spark.ml.optim.WeightedLeastSquares$Aggregator.validate(WeightedLeastSquares.scala:247)
>   at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:82)
>   at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:180)
>   at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:70)
>   at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
>
> here is the data and code:
>
> {"label":79.3,"features":{"type":1,"values":[6412.143500000001,888.0,1407.0,1.5844594594594594,10.614,12.07,0.12062966031483012,0.9991237664152219,6.065,0.49751449875724935]}}
>
> {"label":72.3,"features":{"type":1,"values":[6306.044500000001,1084.0,1451.0,1.338560885608856,7.018,12.04,0.41710963455149497,0.9992054343916128,6.05,0.4975083056478405]}}
>
> {"label":76.7,"features":{"type":1,"values":[6142.920000000003,1494.0,1437.0,0.9618473895582329,7.939,12.06,0.34170812603648426,0.9992216101762574,6.06,0.49751243781094534]}}
>
> val lr = new LinearRegression().setMaxIter(300).setFeaturesCol("features")
>
> val lrModel = lr.fit(assembleddata)
>
> Any clue or inputs are appreciated.
>
>
> Regards,
>
> Shawn
>
>
>
