mahout-user mailing list archives

From Rajat Banerjee
Subject Looking for Linear Regression on Hadoop
Date Mon, 07 Dec 2009 23:04:55 GMT
Dear Apache Community,
I am looking to perform a linear regression on a rather large amount
of data in my Hadoop cluster. It is part of my master's thesis at
Harvard University.

After perusing the docs on the Mahout site, it seems that the
following algorithms haven't been implemented yet:
Locally-Weighted Linear Regression
Linear Regression
Logistic Regression

Basically, there is a stock market phenomenon which I'm trying to
predict. It is called a short squeeze. I have about 16,000 data
points, each a stock and a point in time at which the phenomenon
occurred. I'm trying to develop a predictive model in a Hadoop cluster.

The accuracy of the model doesn't matter much at this point; the goal,
and what would make my prof happy, is to see the cluster grinding away,
doing some relevant but perhaps not totally correct mathematical
operations. Read: if it's a linear regression I'll be happy, but if it
isn't possible yet I don't mind.

Can anyone suggest something I can use? I've downloaded Mahout 0.2 and
searched through it, but nothing for performing regressions has jumped
out at me.
Thank you.
