mahout-user mailing list archives

From "Prabhu"
Subject RE: Logistic Regression in Mahout
Date Thu, 31 Jan 2013 03:15:12 GMT
Thanks, I thought of that, but that doesn't seem to be the right explanation.
For one, in the output I see an equation like
TargetVariable ~ -0.001*InterceptTerm + - 0.0006*predictor1 +
-0.0004*predictor2 ....

Also, if I look at, say, predictor1, the co-efficient in R is 1.02 and for
predictor2 it is 0.48, whereas in Mahout I get -0.00063 for predictor1 and
-0.00042 for predictor2. Now if these values are logs of what I am looking
for, e^-0.00063 is 0.99937 and e^-0.00042 is 0.99958, so the difference
is marginal, whereas the R co-efficients indicate predictor1 has a much higher
weight than predictor2, which is what I would expect.
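
The arithmetic in the paragraph above can be checked with a short Python
sketch. It only uses the four co-efficients quoted in this message; the
variable names are illustrative, and nothing here calls Mahout or R. It
exponentiates the Mahout values (the "logs of the weights" hypothesis) and
compares the predictor1/predictor2 ratio in each tool.

```python
import math

# Co-efficients as quoted in the message (not re-derived from any data).
mahout = {"predictor1": -0.00063, "predictor2": -0.00042}
r = {"predictor1": 1.02, "predictor2": 0.48}

# Hypothesis under test: the Mahout values are logs of the R-style weights.
exp_mahout = {name: math.exp(w) for name, w in mahout.items()}

for name in mahout:
    print(f"{name}: mahout={mahout[name]}  exp(mahout)={exp_mahout[name]:.5f}  R={r[name]}")

# Relative weight of predictor1 versus predictor2 in each tool.
mahout_ratio = mahout["predictor1"] / mahout["predictor2"]
r_ratio = r["predictor1"] / r["predictor2"]
print(f"ratio predictor1/predictor2: mahout={mahout_ratio:.3f}  R={r_ratio:.3f}")
```

Both exponentiated values land just below 1, far from 1.02 and 0.48, so
taking exponents alone does not reconcile the two sets of numbers.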

Any other thoughts, ideas?


-----Original Message-----
From: Jake Mannix
Sent: 31 January 2013 04:54
Subject: Re: Logistic Regression in Mahout

Looks like you're looking at weights which are logs of the weights you think
you want.

On Wed, Jan 30, 2013 at 4:12 AM, Prabhu wrote:

> Hi all,
>     I am trying to use Mahout to run a logistic regression analysis on
> some data. The data is about 7 million rows, with about 20 predictor
> variables (all of them numeric). The target variable is Boolean - 0 or 1.
> I run a logistic regression with this data in R and I get good
> co-efficients which make sense. But when I run a logistic regression
> on the exact same data using Mahout, I get co-efficients that don't
> make sense. For a start, all co-efficients are negative. The
> interesting thing is that the most important variable (the one with
> the highest co-efficient in R) has the least negative value in Mahout.
> Can someone please help me understand what the cause of the problem is?
> Thanks
> Prabhu


