spark-user mailing list archives

From Chang-Jia Wang ...@cjwang.us>
Subject Re: LogisticRegressionWithLBFGS shows ERRORs
Date Mon, 16 Mar 2015 19:48:01 GMT
I just used random numbers.

(My ML lib was spark-mllib_2.10-1.2.1)

Please see the attached log. In the middle of the log, I dumped the data set before feeding
it into LogisticRegressionWithLBFGS. The first column (false/true) is the label (attribute “a”),
and columns 2-5 (attributes “x”, “y”, “z”, and “i”) are the features. The
6th column is just the row ID and was not used.

The relationship was chosen arbitrarily: a = (0.3 * x + 0.5 * y - 0.2 * z > 0.4)
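
For reference, a data set of this shape can be reproduced with a short script. This is only a sketch: the message says the features were random numbers but does not give their distribution, so uniform values in [0, 1) are an assumption, as are the helper name `make_rows` and the seed.

```python
import random

def make_rows(n, seed=0):
    """Build rows of (a, x, y, z, i, row_id); hypothetical helper, not from the original log."""
    rng = random.Random(seed)
    rows = []
    for row_id in range(n):
        # Assumption: features are uniform in [0, 1).
        x, y, z, i = (rng.random() for _ in range(4))
        # Label follows the rule from the message; "i" is deliberately unused.
        a = 0.3 * x + 0.5 * y - 0.2 * z > 0.4
        rows.append((a, x, y, z, i, row_id))
    return rows

rows = make_rows(1000)
print(sum(r[0] for r in rows), "positive labels out of", len(rows))
```

Because the label is an exact linear threshold of the features, the two classes in such a data set are linearly separable.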

After that, you can see LBFGS doing its job and then emitting the error messages.

The model showed coefficients:

396.57624765427323, x
662.7969020937115, y
-259.0975519038385, z
12.352037503257826, i
-538.8516249699426, @a

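The last one (@a) is the intercept. As you can see, the model is close to the original relationship: up to a common scale factor, the coefficients for x, y, and z match 0.3, 0.5, and -0.2, and the coefficient for the unused feature i is near zero.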
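
One way to check "close enough" is to rescale the fitted coefficients, since only the direction of the weight vector determines the decision boundary. The sketch below uses the numbers reported above; choosing the scale so that the intercept maps to -0.4 is my own convention for the comparison, not something the model reports.

```python
# Coefficients reported by the fitted model (copied from this message).
fitted = {
    "x": 396.57624765427323,
    "y": 662.7969020937115,
    "z": -259.0975519038385,
    "i": 12.352037503257826,
}
intercept = -538.8516249699426

# The generating rule 0.3*x + 0.5*y - 0.2*z > 0.4 has threshold 0.4,
# so pick the scale that maps the fitted intercept to -0.4.
scale = -intercept / 0.4
rescaled = {name: w / scale for name, w in fitted.items()}

for name, w in rescaled.items():
    print(f"{name}: {w:.4f}")
```

The rescaled values come out at roughly 0.29, 0.49, -0.19, and 0.01 for x, y, z, and i, against the true 0.3, 0.5, -0.2, and 0.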

After that, I fed the same data back into the model to see how well the predictions worked
(here attribute “a” is the prediction and “aa” is the original label). I only displayed
20 rows.

The result showed 2 errors out of 1000 rows, an error rate of 0.2%.

count(INTEGER), errorRate(DOUBLE), countDiff(INTEGER)
key=[], rows=1
1000, 0.0020000000949949026, 2
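
The trailing digits in errorRate are consistent with 2/1000 having been stored as a 32-bit float and then displayed as a double. That is an assumption about the tool that produced this output, not something stated in the log. A quick sketch of the arithmetic:

```python
import struct

count, count_diff = 1000, 2
error_rate = count_diff / count  # 0.002 as a 64-bit double

# Round-trip through a 32-bit float, then widen back to a double.
as_float32 = struct.unpack("f", struct.pack("f", error_rate))[0]

print(error_rate)
print(as_float32)
```

The second value differs from 0.002 only by single-precision rounding, so the error rate itself is exactly 2 in 1000.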

So the algorithm worked; it is just annoying that it spits out these errors. If they do not
affect the result, maybe they should be logged as warnings or info instead.

C.J.

On Mar 15, 2015, at 12:42 AM, DB Tsai <dbtsai@dbtsai.com> wrote:

> In the LBFGS version of logistic regression, the data is properly
> standardized, so this should not happen. Can you provide a copy of
> your dataset so we can test it? If the dataset cannot be made
> public, can you just send me a copy so I can dig into this? I'm
> the author of LORWithLBFGS. Thanks.
> 
> Sincerely,
> 
> DB Tsai
> -------------------------------------------------------
> Blog: https://www.dbtsai.com
> 
> 
> On Fri, Mar 13, 2015 at 2:41 PM, cjwang <cj@cjwang.us> wrote:
>> I am running LogisticRegressionWithLBFGS.  I got these lines on my console:
>> 
>> 2015-03-12 17:38:03,897 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to 0.5
>> 2015-03-12 17:38:03,967 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to 0.25
>> 2015-03-12 17:38:04,036 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to 0.125
>> 2015-03-12 17:38:04,105 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to
>> 0.0625
>> 2015-03-12 17:38:04,176 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to
>> 0.03125
>> 2015-03-12 17:38:04,247 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to
>> 0.015625
>> 2015-03-12 17:38:04,317 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to
>> 0.0078125
>> 2015-03-12 17:38:04,461 ERROR breeze.optimize.StrongWolfeLineSearch |
>> Encountered bad values in function evaluation. Decreasing step size to
>> 0.005859375
>> 2015-03-12 17:38:04,605 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:04,672 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:04,747 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:04,818 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:04,890 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:04,962 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:05,038 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:05,107 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:05,186 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:05,256 INFO breeze.optimize.StrongWolfeLineSearch | Line
>> search t: NaN fval: NaN rhs: NaN cdd: NaN
>> 2015-03-12 17:38:05,257 ERROR breeze.optimize.LBFGS | Failure! Resetting
>> history: breeze.optimize.FirstOrderException: Line search zoom failed
>> 
>> 
>> What causes them and how do I fix them?
>> 
>> I checked my data and nothing seemed out of the ordinary.  The
>> resulting prediction model seemed acceptable to me.  So, are these ERRORs
>> actually WARNINGs?  Could we, or should we, lower the level of these
>> messages by one notch?
>> 
>> 
>> 
>> --
>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/LogisticRegressionWithLBFGS-shows-ERRORs-tp22042.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.