spark-dev mailing list archives

From Joseph PENG <josephtengp...@gmail.com>
Subject Re: GLM Poisson Model - Deviance calculations
Date Thu, 19 Apr 2018 01:25:05 GMT
Are you referring to this?

    override def deviance(y: Double, mu: Double, weight: Double): Double = {
      2.0 * weight * (y * math.log(y / mu) - (y - mu))
    }

I'm not sure how R handles this, but my guess is that it adds a small
number, e.g. 0.5, to the numerator and denominator. If you can confirm
that's the issue, I will look into it.
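
To make the failure mode concrete, here is a minimal sketch of the y = 0 case.
It assumes the convention that y * log(y / mu) is taken as 0 when y = 0 (its
limit as y approaches 0). The object and method names are hypothetical and
are not Spark's API; this is only an illustration, not a proposed patch.

```scala
// Hypothetical helper for illustration only; not part of Spark.
object PoissonDevianceSketch {

  // y * log(y / mu), using the limiting value 0.0 when y == 0.
  // Assumption: the limit convention is the behaviour wanted here.
  def yLogYOverMu(y: Double, mu: Double): Double =
    if (y == 0.0) 0.0 else y * math.log(y / mu)

  // Zero-safe Poisson unit deviance: 2 * w * (y * log(y / mu) - (y - mu)).
  def deviance(y: Double, mu: Double, weight: Double): Double =
    2.0 * weight * (yLogYOverMu(y, mu) - (y - mu))

  // The formula as quoted above, with no guard. For y == 0 this evaluates
  // 0.0 * log(0.0) = 0.0 * -Infinity, which is NaN in IEEE arithmetic.
  def naiveDeviance(y: Double, mu: Double, weight: Double): Double =
    2.0 * weight * (y * math.log(y / mu) - (y - mu))
}
```

With y = 0 and mu = 2, the naive form returns NaN, while the guarded form
returns 2 * mu = 4.0. Whether R uses this convention or an additive offset
is not something this sketch settles.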

On Wed, Apr 18, 2018 at 6:46 PM, Sean Owen <srowen@gmail.com> wrote:

> GeneralizedLinearRegression.ylogy seems to handle this case; can you be
> more specific about where the log(0) happens? That's what should be fixed,
> right? If so, then a JIRA and PR are the right way to proceed.
>
> On Wed, Apr 18, 2018 at 2:37 PM svattig <srikar.vattigunta@gmail.com>
> wrote:
>
>> In Spark 2.3, when a Poisson model is fit with a labelCol that has a few
>> counts equal to 0, the deviance calculations break as a result of log(0).
>> I think the same happens in Spark 2.2.
>> But the new toString method in Spark 2.3's
>> GeneralizedLinearRegressionTrainingSummary class throws a
>> NumberFormatException at line 1551. Because of this exception, we are not
>> able to get the summary object from the model fit.
>>
>> Can the toString method and the deviance calculations be fixed, for
>> example by taking log(1) instead of log(0) whenever the count is 0?
>>
>> Thanks,
>> Srikar.V
>>
