mahout-user mailing list archives

From Svetlomir Kasabov <skasa...@smail.inf.fh-brs.de>
Subject Re: Logistic Regression + Time Series
Date Tue, 14 Jun 2011 14:14:02 GMT
Many thanks to all of you for the replies!

OK, I have now developed a rough concept for training Mahout's
OnlineLogisticRegression model on time series (please correct me if you
spot any issues):

Given the following observations for patient 1, where a predictor is 
'Heart Rate' and a target variable is 'State':

Hour | Heart Rate (mean) | State
-----------------------------------------------
1    | 90                | stable
2    | 92                | stable
3    | 94                | stable
4    | 98                | stable
5    | 100               | unstable

I want to train Mahout to predict the 'State' 1 hour in the future
(future window), based on the data from 1 hour in the past (past
window). Assume we are at hour 2 in the table. To create a training
example, we take the 'Heart Rate' (or some deltas derived from the
heart rates) from hour 1 and the 'State' from hour 3. The next training
example pairs the 'Heart Rate' from hour 2 with the 'State' from hour 4,
and so on.
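
The pairing described above can be sketched in plain Java as follows.
This is only an illustration under assumptions: the class name
LaggedExamples, the method build, and the 0/1 encoding of 'State'
(0 = stable, 1 = unstable) are hypothetical and not part of Mahout, and
the step of feeding the examples to OnlineLogisticRegression is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Mahout class): builds (past predictor, future label) pairs.
final class LaggedExamples {

    /** One training example: predictor from the past window, label from the future window. */
    static final class Example {
        final double heartRate; // mean heart rate observed pastLag hours before "now"
        final int label;        // assumed encoding: 0 = stable, 1 = unstable, futureLag hours after "now"

        Example(double heartRate, int label) {
            this.heartRate = heartRate;
            this.label = label;
        }
    }

    /**
     * For every position "now" that has both a past and a future observation,
     * pair heartRate[now - pastLag] with state[now + futureLag].
     */
    static List<Example> build(double[] heartRate, int[] state, int pastLag, int futureLag) {
        if (heartRate.length != state.length) {
            throw new IllegalArgumentException("heartRate and state must have the same length");
        }
        List<Example> examples = new ArrayList<>();
        for (int now = pastLag; now + futureLag < heartRate.length; now++) {
            examples.add(new Example(heartRate[now - pastLag], state[now + futureLag]));
        }
        return examples;
    }

    public static void main(String[] args) {
        // Hours 1..5 from the table above (array index 0 corresponds to hour 1).
        double[] heartRate = {90, 92, 94, 98, 100};
        int[] state = {0, 0, 0, 0, 1};

        for (Example e : build(heartRate, state, 1, 1)) {
            System.out.println("heartRate=" + e.heartRate + " -> label=" + e.label);
        }
    }
}
```

With the five hours from the table and both windows set to 1 hour, this
yields three examples: hour 1 with the state of hour 3, hour 2 with the
state of hour 4, and hour 3 with the state of hour 5.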

My question is: how does Mahout discover the 'time' aspect of the
training? Won't I get the same result if I swap the training
examples? Am I missing something? Are there other issues with the concept?

Thanks and best regards,

Svetlomir.


On 06.06.2011 22:30, Josh Patterson wrote:
> I've done a bit of time series data mining with Hadoop; I've written
> up some basics on time series and map reduce at our blog:
>
> http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
> http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
> http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/
>
> While these articles won't help you on the LR end of things, they do
> give you working code on GitHub to use as a basis for time
> series, secondary sort, and sliding windows.
>
> Josh
>
> On Sun, Jun 5, 2011 at 10:08 AM, Svetlomir Dimitrov Kasabov
> <svetlomir.kasabov@smail.inf.fh-bonn-rhein-sieg.de>  wrote:
>> Hello,
>>
>> I plan to use Apache Mahout's Logistic Regression (LR) implementation in my
>> master's thesis. We plan to use time series to predict whether a
>> particular patient will soon have an unstable blood flow. That's why
>> I want to ask you whether it is possible to use Mahout with time
>> series. Do you see any potential problems or risks?
>>
>> Many thanks and best regards!
>>
>> Svetlomir Kasabov.
>>
>>
>>
>> --
>> Svetlomir Dimitrov Kasabov
>>
>> ----------------------------------------------------------------
>> This message was sent using IMP, the Internet Messaging Program.
>>
>>
>
>

