Thanks, I thought of that, but that doesn't seem to be the right explanation
either.
For one, in the output I see an equation like
TargetVariable ~ 0.001*InterceptTerm +  0.0006*predictor1 +
0.0004*predictor2 ....
Also, if I look at, say, predictor1, the coefficient in R is 1.02 and for
predictor2 it is 0.48, whereas in Mahout I get 0.00063 for predictor1 and
0.00042 for predictor2. Now if these values are logs of what I am looking
for, exponentiating them changes very little: taking the negative sign I
mentioned earlier, e^-0.00063 is 0.99937 and e^-0.00042 is 0.99958, so the
difference is marginal, whereas the R coefficients indicate predictor1 has a
much higher weight than predictor2, which is what I would expect.
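To double-check the arithmetic above, here is a minimal Python sketch. The
variable and function names are mine, and the numbers are only the ones quoted
in this mail. Because my first mail reports negative coefficients while the
equation above shows positive ones, it exponentiates both +b and -b.

```python
import math

# Coefficients quoted in this mail: R values as reported,
# Mahout values as magnitudes (their sign is ambiguous in the thread).
r_coef = {"predictor1": 1.02, "predictor2": 0.48}
mahout_coef = {"predictor1": 0.00063, "predictor2": 0.00042}

def exp_both_signs(beta):
    """Return (e^beta, e^-beta) for a coefficient magnitude beta."""
    return math.exp(beta), math.exp(-beta)

for name in r_coef:
    pos, neg = exp_both_signs(mahout_coef[name])
    print(f"{name}: Mahout e^+b={pos:.5f}, e^-b={neg:.5f}, "
          f"R e^b={math.exp(r_coef[name]):.3f}")

# The predictor1/predictor2 ratio does not depend on any common
# rescaling of the weights, so it is a fairer comparison.
r_ratio = r_coef["predictor1"] / r_coef["predictor2"]
mahout_ratio = mahout_coef["predictor1"] / mahout_coef["predictor2"]
print(f"ratio in R: {r_ratio:.3f}, ratio in Mahout: {mahout_ratio:.3f}")
```

Whichever sign is used, the exponentiated Mahout values stay within about
0.001 of 1, so taking exponentials alone does not bring them anywhere near the
R coefficients.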
Any other thoughts, ideas?
Thanks
Prabhu
-----Original Message-----
From: Jake Mannix [mailto:jake.mannix@gmail.com]
Sent: 31 January 2013 04:54
To: user@mahout.apache.org
Subject: Re: Logistic Regression in Mahout
Looks like you're looking at weights which are logs of the weights you think
you want.
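For reference, the relationship between a logistic regression weight and its
exponential can be written out as follows. This assumes the standard logistic
model; the symbols (p for the probability of the target being 1, \beta_0 for
the intercept, \beta_i and x_i for the weights and predictors) are introduced
here for illustration and are not taken from the Mahout or R output.

```latex
\log\frac{p}{1-p} = \beta_0 + \sum_{i} \beta_i x_i
\quad\Longleftrightarrow\quad
\frac{p}{1-p} = e^{\beta_0} \prod_{i} \left(e^{\beta_i}\right)^{x_i}
```

Under this model, e^{\beta_i} is the factor by which the odds change for a
one-unit increase in x_i, so a weight close to 0 corresponds to an odds ratio
close to 1.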
On Wed, Jan 30, 2013 at 4:12 AM, Prabhu <prabhu@mediaiqdigital.com> wrote:
> Hi all,
>
> I am trying to use Mahout to run a logistic regression analysis on
> some data. The data is about 7 million rows, with about 20 predictor
> variables (all of them numeric). The target variable is Boolean (0 or 1).
>
> I run a logistic regression with this data in R and I get good
> coefficients which make sense. But when I run a logistic regression
> on the exact same data using Mahout, I get coefficients that don't
> make sense. For a start, all coefficients are negative. The
> interesting thing is that the most important variable in R (the one
> with the highest coefficient) has the least negative value in Mahout.
> Can someone please help me understand what the cause of the problem is?
>
>
>
> Thanks
>
> Prabhu

jake
