spark-user mailing list archives

From DB Tsai <dbt...@dbtsai.com>
Subject Re: LogisticRegressionWithLBFGS shows ERRORs
Date Thu, 26 Mar 2015 00:33:25 GMT
We fixed a couple of issues in the breeze LBFGS implementation. Can you try
Spark 1.3 and see if the errors still occur? Thanks.

Sincerely,

DB Tsai
-------------------------------------------------------
Blog: https://www.dbtsai.com


On Mon, Mar 16, 2015 at 12:48 PM, Chang-Jia Wang <cj@cjwang.us> wrote:
> I just used random numbers.
>
> (My ML lib was spark-mllib_2.10-1.2.1)
>
> Please see the attached log.  In the middle of the log, I dumped the data
> set before feeding into LogisticRegressionWithLBFGS.  The first column
> false/true was the label (attribute “a”), and columns 2-5 (attributes “x”,
> “y”, “z”, and “i”) were the features.  The 6th column was just row ID and
> was not used.
>
> The relationship was arbitrary: a = (0.3 * x + 0.5 * y - 0.2 * z > 0.4)
>
> After that you can find LBFGS was doing its job and then pumped out the
> error messages.
>
> The model showed coefficients:
>
> 396.57624765427323, x
> 662.7969020937115, y
> -259.0975519038385, z
> 12.352037503257826, i
> -538.8516249699426, @a
>
> The last one was the intercept.  As you can see, the model seemed close
> enough.
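
The fitted coefficients are far larger than the weights in the rule, but a linear decision boundary does not change when every weight and the intercept are multiplied by the same positive constant. A minimal sketch of that comparison follows. It assumes the rule is rewritten as 0.3*x + 0.5*y - 0.2*z + 0*i - 0.4 > 0, and the helper `rescale` and the choice of "y" as the reference weight are illustrative, not part of MLlib.

```python
# True rule from the message: a = (0.3*x + 0.5*y - 0.2*z > 0.4),
# rewritten as 0.3*x + 0.5*y - 0.2*z + 0.0*i - 0.4 > 0.
true_w = {"x": 0.3, "y": 0.5, "z": -0.2, "i": 0.0, "intercept": -0.4}

# Fitted coefficients exactly as reported in the message.
fitted_w = {
    "x": 396.57624765427323,
    "y": 662.7969020937115,
    "z": -259.0975519038385,
    "i": 12.352037503257826,
    "intercept": -538.8516249699426,
}


def rescale(weights, reference_key="y", reference_value=0.5):
    """Scale all weights so weights[reference_key] equals reference_value.

    Hypothetical helper: a positive rescaling leaves the decision
    boundary w.x + b > 0 unchanged.
    """
    factor = reference_value / weights[reference_key]
    return {name: value * factor for name, value in weights.items()}


scaled = rescale(fitted_w)
for name in true_w:
    print(f"{name:>9}: true={true_w[name]:+.3f}  fitted(rescaled)={scaled[name]:+.3f}")
```

After rescaling, the fitted weights land within about 0.01 of the true ones, which is what "close enough" amounts to here.
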
>
> After that I fed the same data back to the model to see how the predictions
> worked.  (Here attribute “a” was the prediction and “aa” was the original
> label.)  I only displayed 20 rows.
>
> The error rate showed 2 errors out of 1000.
>
> count(INTEGER), errorRate(DOUBLE), countDiff(INTEGER)
> key=[], rows=1
> 1000, 0.0020000000949949026, 2
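
The reported rate is 2/1000 = 0.002, yet the column shows 0.0020000000949949026. That value is what 0.002 becomes when it is stored as a 32-bit float and then widened to a 64-bit double. Whether the pipeline above actually computed the rate in single precision is an assumption; the thread does not say. A small sketch of the arithmetic:

```python
import struct

errors, rows = 2, 1000
rate = errors / rows  # 0.002 as a 64-bit float

# Round-trip through a 32-bit float, then read the result back as a
# 64-bit float. This mimics a single-precision computation whose result
# is later widened to DOUBLE (an assumption about the pipeline).
rate_f32 = struct.unpack("f", struct.pack("f", rate))[0]

print(rate)
print(rate_f32)
```

The difference is on the order of 1e-10 and has no bearing on the count of 2 errors.
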
>
> So, the algorithm worked; it was just annoying that it spat out the errors.
> If they do not affect the result, maybe they should be logged as warning or info.
>
> C.J.
>
> On Mar 15, 2015, at 12:42 AM, DB Tsai <dbtsai@dbtsai.com> wrote:
>
> In the LBFGS version of logistic regression, the data is properly
> standardized, so this should not happen. Can you provide a copy of
> your dataset to us so we can test it? If the dataset cannot be made
> public, can you just send me a copy so I can dig into this? I'm
> the author of LORWithLBFGS. Thanks.
>
> Sincerely,
>
> DB Tsai
> -------------------------------------------------------
> Blog: https://www.dbtsai.com
>
>
> On Fri, Mar 13, 2015 at 2:41 PM, cjwang <cj@cjwang.us> wrote:
>
> I am running LogisticRegressionWithLBFGS.  I got these lines on my console:
>
> 2015-03-12 17:38:03,897 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.5
> 2015-03-12 17:38:03,967 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.25
> 2015-03-12 17:38:04,036 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.125
> 2015-03-12 17:38:04,105 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.0625
> 2015-03-12 17:38:04,176 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.03125
> 2015-03-12 17:38:04,247 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.015625
> 2015-03-12 17:38:04,317 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.0078125
> 2015-03-12 17:38:04,461 ERROR breeze.optimize.StrongWolfeLineSearch | Encountered bad values in function evaluation. Decreasing step size to 0.005859375
> 2015-03-12 17:38:04,605 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:04,672 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:04,747 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:04,818 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:04,890 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:04,962 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:05,038 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:05,107 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:05,186 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:05,256 INFO breeze.optimize.StrongWolfeLineSearch | Line search t: NaN fval: NaN rhs: NaN cdd: NaN
> 2015-03-12 17:38:05,257 ERROR breeze.optimize.LBFGS | Failure! Resetting history: breeze.optimize.FirstOrderException: Line search zoom failed
>
>
> What causes them and how do I fix them?
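
The first seven ERROR lines show the step size being halved each time the function evaluation returns a bad value (0.5 down to 0.0078125). The sketch below illustrates only that general pattern: a line search that shrinks its step until the objective is finite. It is not breeze's implementation, and the toy objective (a logistic loss evaluated naively so that it overflows for large steps) is an assumption chosen to produce non-finite values; the thread does not establish what caused them in this run.

```python
import math


def naive_logistic_loss(margin):
    """log(1 + exp(-margin)), evaluated naively.

    Returns inf when exp() overflows, standing in for a "bad value".
    """
    try:
        return math.log(1.0 + math.exp(-margin))
    except OverflowError:
        return math.inf


def shrink_until_finite(objective, step=1.0, factor=0.5, max_tries=20):
    """Multiply the step by `factor` until objective(step) is finite.

    Hypothetical helper that mirrors the halving visible in the log.
    """
    for _ in range(max_tries):
        value = objective(step)
        if math.isfinite(value):
            return step, value
        print(f"bad value at step {step}; decreasing step size to {step * factor}")
        step *= factor
    raise ArithmeticError("no finite objective value found")


# Toy objective: the margin grows in magnitude with the step taken
# along the search direction. The constant 4000 is arbitrary.
def toy_objective(step):
    return naive_logistic_loss(-4000.0 * step)


step, value = shrink_until_finite(toy_objective)
print(step, value)
```

In this toy setting the search recovers as soon as the step is small enough for the objective to be finite, which is consistent with a model that still trains acceptably despite the messages.
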
>
> I checked my data and there seemed nothing out of the ordinary.  The
> resulting prediction model seemed acceptable to me.  So, are these ERRORs
> actually WARNINGs?  Could we or should we tune the level of these messages
> down one notch?
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/LogisticRegressionWithLBFGS-shows-ERRORs-tp22042.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>
>
>
