mahout-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: Information
Date Wed, 16 Oct 2013 09:02:54 GMT
That's why I was asking a bit more about the problem. It looks to me that
what will bring the most value at the beginning is finding the shortest path,
which is a classical graph algorithm. The results could then be improved by
adjusting the speed of each route according to additional information. As a
client, if it's raining, I only want to know whether I should turn left or
right. Estimating the speed of each route with good enough accuracy is
more complex and is relevant only if there is a single long enough route.
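
As a rough illustration of that idea, here is a minimal sketch, assuming a
tiny hypothetical road graph. It runs Dijkstra's algorithm with each edge
weighted by travel time (length divided by the expected speed for the current
season and event status). The street names ABC and DEF, the node names, the
lengths and the helper fastest_route are all made up for the example; only
the XYZ speeds are taken from the quoted message below.

```python
import heapq

CONDITIONS = [(s, e) for s in ("spring", "autumn") for e in (False, True)]

# Average speed in km/h per street, keyed by (season, event_in_progress).
# XYZ uses the figures quoted in this thread; ABC and DEF are assumptions.
SPEEDS = {
    "XYZ": {("spring", False): 50, ("spring", True): 20,
            ("autumn", False): 40, ("autumn", True): 15},
    "ABC": {c: 35 for c in CONDITIONS},
    "DEF": {c: 35 for c in CONDITIONS},
}

# Directed edges: node -> list of (neighbor, street name, length in km).
GRAPH = {
    "home": [("restaurant", "XYZ", 5.0), ("corner", "ABC", 3.0)],
    "corner": [("restaurant", "DEF", 4.0)],
    "restaurant": [],
}


def fastest_route(graph, speeds, start, goal, season, event):
    """Dijkstra's algorithm with edge weight = length / expected speed (hours)."""
    queue = [(0.0, start, [])]  # (hours so far, node, streets taken)
    done = set()
    while queue:
        hours, node, streets = heapq.heappop(queue)
        if node == goal:
            return hours, streets
        if node in done:
            continue
        done.add(node)
        for neighbor, street, km in graph[node]:
            if neighbor not in done:
                kmh = speeds[street][(season, event)]
                heapq.heappush(
                    queue, (hours + km / kmh, neighbor, streets + [street])
                )
    return None  # goal is not reachable from start


if __name__ == "__main__":
    for event in (False, True):
        print(event, fastest_route(GRAPH, SPEEDS, "home", "restaurant",
                                   "spring", event))
```

With these made-up numbers, the direct street XYZ wins in spring without an
event, while the detour over ABC and DEF becomes faster once an event slows
XYZ down to 20 km/h. Only the speed table changes between the two calls; the
graph and the algorithm stay the same.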

If you are dealing with a large volume of data, there are also graph
solutions for Hadoop like Giraph or Hama.

IMHO, YMMV...

Bertrand




On Tue, Oct 15, 2013 at 10:01 PM, Angelo Immediata <angeloimm@gmail.com> wrote:

> hi All
>
> First of all thank you for the great suggestions you gave me; you are
> simply great :)
> Anyway, returning to my problem, I'll try to be as clear as
> possible... As far as I know (but we are still collecting requirements and
> understanding which kind of data we will have) we should have a situation
> of this type:
> on street XYZ in Spring without any events (an event can be a demonstration,
> a parade, etc.) the average speed is 50 km/h
> on street XYZ in Spring with an event the average speed is 20 km/h
> on street XYZ in Autumn without any events (an event can be a demonstration,
> a parade, etc.) the average speed is 40 km/h
> on street XYZ in Autumn with an event the average speed is 15 km/h
>
> and so on for all the streets of interest (basically using the OpenStreetMap
> data); note that we are not interested in the worst case, that is, the
> case with an accident (at least as far as I know).
>
> Now my customer would like to offer this kind of functionality to the clients:
> a client connects to the site (or downloads an app) and wants to go
> by car to restaurant W; he/she would like to know whether it's a good idea
> to take that street or to look for a different one. So, knowing the
> season (Spring, Summer, Autumn or Winter) and knowing whether there
> are any events (demonstrations, parades, etc.), I should tell him/her: if
> you take street XYZ you will probably travel at 50 km/h or 20 km/h (the best
> would be if I could suggest a different way... but this is another topic :) )
>
> So, since I should use old data in order to suggest to the client the
> speed he/she may have on street XYZ, I was thinking of using Mahout... but
> maybe I was wrong (sadly I'm really new to this kind of world... though I'm
> finding it amazing)
>
>
> Now by using the "old" data (the one I listed previously)
>
>
>
> 2013/10/15 Andrew Butkus <andrew@butkus.co.uk>
>
> >
> > After giving it some more thought, you could do something like this:
> >
> > Store:
> >
> > route
> > {
> >         road
> >         {
> >                 timestamp,
> >                 time_to_run_road,
> >         }
> > }
> >
> > Then build up a bigger model, which extracts the timestamp and the time it
> > takes to run each road on the route, and calculates an average on a per-day
> > basis. For example, if you travel this route every Monday at 9am, extract
> > the timestamps which match every Monday at 9am and average the
> > time_to_run_road data you have collected on Mondays for that road. If you
> > want to see how long it takes to run a road every Monday at 9am in
> > January, extract all timestamps for that road that match Mondays at 9am
> > in January.
> >
> > Not entirely sure where Mahout fits in here, but this could be a potential
> > way forward for you (assuming you can collect/have data about the road)
> >
> > Hope that helps
> >
> > Andy
> >
> > On 15 Oct 2013, at 13:09, Andrew Butkus <andrew@butkus.co.uk> wrote:
> >
> > > Also to add to this you probably wouldn't want to do it by route, but
> > > maybe break it down by road, this gives more coverage and greater
> > > granularity
> > >
> > > Sent from my Windows Phone
> > > From: Andrew Butkus
> > > Sent: 15/10/2013 13:07
> > > To: Bertrand Dechoux; user@mahout.apache.org
> > > Subject: RE: Information
> > > I'm not sure; I think the last 2 can be predicted. For example, in
> > > January in the UK we get bad weather which causes delays, and on average
> > > it will take longer to run a route in this month because of that.
> > >
> > > Considering weather as a variable is probably not scalable; recording
> > > the time to run a route with a timestamp should be good enough.
> > >
> > > Also consider that once a year there is a festival in Reading, so over
> > > this weekend routes through Reading will always take longer.
> > >
> > > I'm not sure where Mahout can fit this problem, but if you can train on
> > > route time and add a timestamp, this would give you something
> > > scalable. Then figure out on average how long it takes to run a route
> > > at a similar timestamp, for example by minute, hour, week, month, year.
> > >
> > > Sent from my Windows Phone
> > > From: Bertrand Dechoux
> > > Sent: 15/10/2013 08:33
> > > To: user@mahout.apache.org
> > > Subject: Re: Information
> > > The biggest point is what data you have and what exactly your problem is.
> > >
> > > The maximum speed of the route can be easily known and in the best case
> > > that would be your speed. From a very broad point of view, there are
> > > three reasons for a slowdown.
> > > 1) traffic jam
> > > 2) accident
> > > 3) bad weather
> > >
> > > But without up-to-date observations, those three points are non-trivial
> > > to predict (especially the last two). Doing simple statistics (like an
> > > average) can be a good start to see the variations and understand what
> > > factors should be taken into account.
> > >
> > > In the end, you want to do a regression, but classification and
> > > clustering might help before that. Hard to say more without knowing why
> > > the average speed is important, for which area, at which time...
> > >
> > > Bertrand
> > >
> > > On Tue, Oct 15, 2013 at 9:14 AM, Pavan K Narayanan <
> > > pavan.narayanan@gmail.com> wrote:
> > >
> > >> Based on the information you have provided, street routing is
> > >> potentially a Vehicle Routing Problem which is based on TSPs. You can
> > >> check out the link below:
> > >> https://cwiki.apache.org/confluence/display/MAHOUT/Traveling+Salesman
> > >> Secondly, if you want to use Mahout for forecasting, it is not possible
> > >> yet, as the solution methodology for forecasting (LWR) is still an open
> > >> problem.
> > >> https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms
> > >>
> > >> Bottom line: IMHO, you cannot use Mahout for forecasting at the moment;
> > >> good luck with your project.
> > >>
> > >> Also, you can explore parallel computing paradigms if you have
> > >> relatively high volumes of data.
> > >>
> > >>
> > >> On 15 October 2013 12:19, Angelo Immediata <angeloimm@gmail.com> wrote:
> > >>
> > >>> Hi there
> > >>>
> > >>> I'm pretty new to machine learning and to Apache Mahout as well, so
> > >>> pardon me if this question is not quite right :)
> > >>>
> > >>> I'm on a street routing project where, besides other functionality, we
> > >>> have to make forecasts. Precisely, we should be able to forecast the
> > >>> average speed on a street in a well-known period or season (e.g. we
> > >>> should be able to answer this kind of question: on the American Route 66,
> > >>> what will the average speed be in spring 2015?)
> > >>> As far as I know, in order to offer this functionality we should use
> > >>> some machine learning; this is the reason I'm checking out Mahout
> > >>> (moreover, we need to guarantee high performance, and since Mahout is
> > >>> based on Apache Hadoop and uses Map/Reduce, it seems really amazing to me)
> > >>> The first question I'd love to ask is: can I use Apache Mahout to
> > >>> implement the functionality described above?
> > >>> If I can use it, I'll surely need some data in order to "train"
> > >>> Mahout... can I train Mahout at a different time from when I need the
> > >>> prediction? I mean: can I run the training, let's say, every week at 10pm
> > >>> and then offer the forecasting functionality only when a user is
> > >>> interested in it? Should I store the training result in some way?
> > >>> And last but not least :), assuming I can use Mahout... which
> > >>> algorithm should I use to implement my scenario?
> > >>>
> > >>> Thank you for the help, and pardon me if I was not entirely correct
> > >>>
> > >>
> >
> >
>



-- 
Bertrand Dechoux
