mahout-user mailing list archives

From Luis Carlos Guerrero Covo <lcguerreroc...@gmail.com>
Subject Re: Setting up a recommender
Date Fri, 19 Jul 2013 22:39:09 GMT
I'm currently working for a portal with a similar use case, and I was
thinking of implementing it in a similar way. Right now I generate
recommendations with Python scripts based on similarity measures
(content-based recommendation), using only Euclidean distance and a weight
for each attribute. I want to use Mahout's GenericItemBasedRecommender to
generate these same recommendations without user data (we are not yet
tracking user-to-item relationships). I was thinking of pushing the
generated recommendations to Solr using atomic updates, since all my fields
are currently stored. Since this is very close to what I'm trying to
accomplish, I'd like to sign up to collaborate in any way I can: I'm fairly
familiar with Solr and I'm starting to learn my way around Mahout.
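
To make the content-based setup described above concrete, here is a minimal
sketch, not the actual scripts. It assumes each item is a numeric attribute
vector, that the weights scale the squared differences, and that the Solr
unique key is called "id" with a multivalued field named "recommendations"
(both field names are hypothetical). It only builds the JSON body for an
atomic update and does not contact Solr.

```python
import json
import math


def weighted_euclidean(a, b, weights):
    """Euclidean distance with each squared difference scaled by its weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def top_k_similar(items, weights, k):
    """Map each item ID to the IDs of its k nearest neighbours."""
    result = {}
    for item_id, vec in items.items():
        scored = sorted(
            (weighted_euclidean(vec, other_vec, weights), other_id)
            for other_id, other_vec in items.items()
            if other_id != item_id
        )
        result[item_id] = [other_id for _, other_id in scored[:k]]
    return result


def atomic_update_payload(recs, field="recommendations"):
    """Build a Solr atomic-update body that sets `field` on each document.

    The field name is an assumption made for this sketch.
    """
    return json.dumps(
        [{"id": item_id, field: {"set": ids}} for item_id, ids in recs.items()]
    )


# Toy data: three items with two attributes each, second attribute down-weighted.
items = {"a": [1.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
weights = [1.0, 0.5]

recs = top_k_similar(items, weights, k=1)
payload = atomic_update_payload(recs)
print(payload)
```

The all-pairs loop is quadratic in the number of items, which is fine for a
small catalogue; a larger one would need an offline job to do this step.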


On Fri, Jul 19, 2013 at 5:12 PM, Sebastian Schelter <ssc@apache.org> wrote:

> I would also be willing to provide guidance and advice for anyone taking
> this on; I can especially help with the offline analysis part.
>
> --sebastian
>
>
> 2013/7/19 Ted Dunning <ted.dunning@gmail.com>
>
> > I would be happy to supervise a project to implement a demo of this if
> > anybody is willing to do the grunt work of gluing things together.
> >
> > Sooo, if you would like to work on this, here is a suggested project.
> >
> > This project would entail:
> >
> > a) build a synthetic data source
> >
> > b) write scripts to do the off-line analysis
> >
> > c) write scripts to export to Solr
> >
> > d) write a very quick web facade over Solr to make it look like a
> > recommendation engine.  This would include
> >
> >   d.1) a "most popular page" that does combined popularity rise and
> > recommendation
> >
> >   d.2) a "personal recommendation page" that does just recommendation
> > with dithering
> >
> >   d.3) item pages with "related items" at the bottom
> >
> > e) work with others to provide high quality system walk-through and
> > install directions
> >
> > If you want to bite on this, we should arrange a weekly video hangout.  I
> > am willing to commit to guiding and providing detailed technical
> > approaches.  You should be willing to commit to actually doing stuff.
> >
> > The goal would be to provide a fully worked out scaffolding of a
> > practical recommendation system that presumably would become an example
> > module in Mahout.
> >
> >
> > On Fri, Jul 19, 2013 at 1:08 PM, B Lyon <bradflyon@gmail.com> wrote:
> >
> > > +1 as well.  Sounds fun.
> > >
> > > On Fri, Jul 19, 2013 at 4:06 PM, Dominik Hübner <contact@dhuebner.com>
> > > wrote:
> > >
> > > > +1 for getting something like that in a future release of Mahout
> > > >
> > > > On Jul 19, 2013, at 10:02 PM, Sebastian Schelter <ssc@apache.org>
> > > > wrote:
> > > >
> > > > > It would be awesome if we could get a nice, easily deployable
> > > > > implementation of that approach into Mahout before 1.0
> > > > >
> > > > >
> > > > > 2013/7/19 Ted Dunning <ted.dunning@gmail.com>
> > > > >
> > > > >> My current advice is to use Hadoop (if necessary) to build a sparse
> > > > >> item-item matrix based on each kind of behavior you have and then
> > > > >> drop those similarities into a search engine to deliver the actual
> > > > >> recommendations.  This allows lots of flexibility in terms of which
> > > > >> kinds of inputs you use for the recommendation and lets you blend
> > > > >> recommendations with search and geo-location.
> > > > >>
> > > > >>
> > > > >> On Fri, Jul 19, 2013 at 12:33 PM, Helder Martins <
> > > > >> helder.garay@corp.terra.com.br> wrote:
> > > > >>
> > > > >>> Hi,
> > > > >>> I'm a dev working for a web portal in Brazil and I'm particularly
> > > > >>> interested in building an item-based collaborative filtering
> > > > >>> recommender for our database of news articles.
> > > > >>> After some coding, I was able to get some recommendations using a
> > > > >>> GenericItemBasedRecommender, a CassandraDataModel and some custom
> > > > >>> classes that store item similarities and migrated item IDs in
> > > > >>> Cassandra. But now I'm unsure what is normally done with this
> > > > >>> recommender: Should I run it as a daemon, cache the recommendations
> > > > >>> in memory and set up a web service to query it online? Should I
> > > > >>> precompute these recommendations for each recent user and store
> > > > >>> them somewhere? My first idea was to store all these recs back in
> > > > >>> Cassandra, but looking at some classes it seems to me that the norm
> > > > >>> is to always read the input data and write the output using files.
> > > > >>> Is this a common practice that benefits from HDFS?
> > > > >>> My use case here is something around 70k recommendation requests
> > > > >>> per second.
> > > > >>>
> > > > >>> Thanks in advance,
> > > > >>>
> > > > >>> --
> > > > >>>
> > > > >>> Regards,
> > > > >>> Helder Martins
> > > > >>> Portal Architecture and Backend Systems
> > > > >>> +55 (51) 3284-4475
> > > > >>> Terra
> > > > >>>
> > > > >>
> > > >
> > > >
> > >
> > >
> > > --
> > > BF Lyon
> > > http://www.nowherenearithaca.com
> > >
> >
>
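
The advice quoted above (build a sparse item-item matrix per kind of
behavior, then drop the similarities into a search engine) can be sketched
roughly as follows. This is an illustration under stated assumptions, not the
method from the thread: plain co-occurrence counting with a minimum-count
threshold stands in for whatever similarity measure the offline analysis
would really use, everything runs in memory rather than on Hadoop, and the
"indicators" field name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(histories):
    """Count, for each item pair, how many users interacted with both items."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def indicator_docs(counts, min_count=2, top_n=10, field="indicators"):
    """Keep each item's strongest co-occurring items as one document per item.

    Thresholding keeps the matrix sparse; `field` is an assumed field name.
    """
    docs = []
    for item in sorted(counts):
        neighbours = counts[item]
        strong = sorted(
            (other for other, c in neighbours.items() if c >= min_count),
            key=lambda other: (-neighbours[other], other),
        )[:top_n]
        if strong:
            docs.append({"id": item, field: strong})
    return docs


# Toy interaction log for one kind of behavior: user -> items viewed.
histories = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c"],
    "u4": ["a", "d"],
}

docs = indicator_docs(cooccurrence(histories))
for doc in docs:
    print(doc)
```

One plausible way to serve recommendations from such documents is to index
them and then query the indicator field with the item IDs from a user's
recent history, so that the search engine's ranking does the scoring. Each
additional kind of behavior would get its own matrix and its own field.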



-- 
Luis Carlos Guerrero Covo
M.S. Computer Engineering
(57) 3183542047
