mahout-user mailing list archives

From Manuel Blechschmidt <Manuel.Blechschm...@gmx.de>
Subject Re: Setting up a recommender
Date Sat, 20 Jul 2013 10:27:22 GMT
Hello,
if there is enough demand for this functionality, my company (http://www.apaxo.de/us/recitems.html)
could implement it. However, we can't do it for free. If everybody who is interested could
contribute to a shared budget, we would be able to write it.

The Codehaus JIRA has an incentive feature for this:
https://secure.donay.com/site/index

Perhaps this might also be useful for the Mahout (a.k.a. Apache) JIRA.

/Manuel

On 20.07.2013 at 00:45, Ted Dunning wrote:

> OK.  I think the crux here is the off-line to Solr part so let's see who
> else pops up.
> 
> Having a solr maven could be very helpful.
> 
> 
> On Fri, Jul 19, 2013 at 3:39 PM, Luis Carlos Guerrero Covo <
> lcguerrerocovo@gmail.com> wrote:
> 
>> I'm currently working for a portal that has a similar use case and I was
>> thinking of implementing this in a similar way. I'm generating
>> content-based recommendations with Python scripts, using only Euclidean
>> distance as the similarity measure, with a weight for each attribute.
>> I want to use Mahout's GenericItemBasedRecommender to
>> generate these same recommendations without user data (no tracking right
>> now of user to item relationship). I was thinking of pushing the generated
>> recommendations to Solr using atomic updates, since all my fields are stored
>> right now. Since this is very similar to what I'm trying to accomplish, I
>> would sign up to collaborate in any way I can, since I'm fairly familiar
>> with Solr and I'm starting to learn my way around Mahout.
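
To make the atomic-update idea above concrete, here is a minimal sketch of the JSON body such an update could use. The field name "recommendations" and the unique key "id" are assumptions for illustration, not taken from anyone's schema; the "set" modifier is Solr's atomic-update operation for replacing a field's value.

```python
import json


def atomic_update_doc(item_id, recommended_ids, field="recommendations"):
    """Build one Solr atomic-update document.

    "id" as the unique key and the default field name are assumptions;
    adjust both to match the actual schema.
    """
    # {"set": [...]} replaces the field's current value and leaves the
    # other stored fields of the document untouched.
    return {"id": item_id, field: {"set": list(recommended_ids)}}


# A batch is a JSON array of such documents.
payload = json.dumps([atomic_update_doc("item-1", ["item-7", "item-3"])])
print(payload)
```

The resulting string would be sent to the core's update handler as application/json; the exact URL depends on the installation.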
>> 
>> 
>> On Fri, Jul 19, 2013 at 5:12 PM, Sebastian Schelter <ssc@apache.org>
>> wrote:
>> 
>>> I would also be willing to provide guidance and advice for anyone taking
>>> this on, I can especially help with the offline analysis part.
>>> 
>>> --sebastian
>>> 
>>> 
>>> 2013/7/19 Ted Dunning <ted.dunning@gmail.com>
>>> 
>>>> I would be happy to supervise a project to implement a demo of this if
>>>> anybody is willing to do the grunt work of gluing things together.
>>>> 
>>>> Sooo, if you would like to work on this, here is a suggested project.
>>>> 
>>>> This project would entail:
>>>> 
>>>> a) build a synthetic data source
>>>> 
>>>> b) write scripts to do the off-line analysis
>>>> 
>>>> c) write scripts to export to Solr
>>>> 
>>>> d) write a very quick web facade over Solr to make it look like a
>>>> recommendation engine.  This would include
>>>> 
>>>>  d.1) a "most popular page" that does combined popularity rise and
>>>> recommendation
>>>> 
>>>>  d.2) a "personal recommendation page" that does just recommendation with
>>>> dithering
>>>> 
>>>>  d.3) item pages with "related items" at the bottom
>>>> 
>>>> e) work with others to provide high quality system walk-through and install
>>>> directions
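
The "dithering" in d.2 is not spelled out in this thread. One commonly described form re-sorts an already ranked list by log(rank) plus Gaussian noise, so the top results mostly stay near the top while deeper items occasionally surface. The sketch below assumes that form; the function name and the epsilon parameter are hypothetical.

```python
import math
import random


def dither(ranked_items, epsilon=2.0, rng=None):
    """Reorder a ranked list by log(rank) plus Gaussian noise.

    Assumed formulation: noise standard deviation is log(epsilon), so
    epsilon=1.0 leaves the order unchanged and larger values mix more.
    """
    rng = rng or random.Random()
    sigma = math.log(epsilon)
    noisy = [
        (math.log(rank) + rng.gauss(0.0, sigma), item)
        for rank, item in enumerate(ranked_items, start=1)
    ]
    # Sort on the noisy score only, so items never need to be comparable.
    noisy.sort(key=lambda pair: pair[0])
    return [item for _, item in noisy]


recommendations = ["a", "b", "c", "d", "e", "f"]
print(dither(recommendations, epsilon=2.0, rng=random.Random(42)))
```

Passing a seeded random.Random, for example one seeded per user and day, would keep a page stable across reloads while still varying it over time.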
>>>> 
>>>> If you want to bite on this, we should arrange a weekly video hangout.  I
>>>> am willing to commit to guiding and providing detailed technical
>>>> approaches.  You should be willing to commit to actually doing stuff.
>>>> 
>>>> The goal would be to provide a fully worked out scaffolding of a practical
>>>> recommendation system that presumably would become an example module in
>>>> Mahout.
>>>> 
>>>> 
>>>> On Fri, Jul 19, 2013 at 1:08 PM, B Lyon <bradflyon@gmail.com> wrote:
>>>> 
>>>>> +1 as well.  Sounds fun.
>>>>> 
>>>>> On Fri, Jul 19, 2013 at 4:06 PM, Dominik Hübner <contact@dhuebner.com> wrote:
>>>>> 
>>>>>> +1 for getting something like that in a future release of Mahout
>>>>>> 
>>>>>> On Jul 19, 2013, at 10:02 PM, Sebastian Schelter <ssc@apache.org> wrote:
>>>>>> 
>>>>>>> It would be awesome if we could get a nice, easily deployable
>>>>>>> implementation of that approach into Mahout before 1.0
>>>>>>> 
>>>>>>> 
>>>>>>> 2013/7/19 Ted Dunning <ted.dunning@gmail.com>
>>>>>>> 
>>>>>>>> My current advice is to use Hadoop (if necessary) to build a sparse
>>>>>>>> item-item matrix based on each kind of behavior you have and then drop
>>>>>>>> those similarities into a search engine to deliver the actual
>>>>>>>> recommendations.  This allows lots of flexibility in terms of which kinds
>>>>>>>> of inputs you use for the recommendation and lets you blend recommendations
>>>>>>>> with search and geo-location.
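
As a rough illustration of the sparse item-item matrix described above, the sketch below counts how often two items share a user and keeps only the strongest neighbours per item. Raw co-occurrence counts are a stand-in chosen for brevity; a real offline job would more likely use a statistical measure such as a log-likelihood ratio. The function names, thresholds and sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(interactions):
    """Count, for each pair of items, how many users interacted with both."""
    by_user = defaultdict(set)
    for user, item in interactions:
        by_user[user].add(item)
    counts = defaultdict(lambda: defaultdict(int))
    for items in by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def indicators(counts, min_count=2, top_n=10):
    """Keep only the strongest co-occurring items per item (the sparse rows)."""
    result = {}
    for item, row in counts.items():
        # Highest count first; ties broken by item ID for a stable order.
        ranked = sorted(row.items(), key=lambda kv: (-kv[1], kv[0]))
        result[item] = [other for other, c in ranked if c >= min_count][:top_n]
    return result


# Made-up (user, item) view events; one such matrix per kind of behavior.
views = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"),
         ("u2", "c"), ("u3", "b"), ("u3", "c")]
rows = indicators(cooccurrence(views))
print(rows)
```

Each row would then be indexed as a field on the item's document in the search engine, and a user's recent items become the query against that field.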
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Jul 19, 2013 at 12:33 PM, Helder Martins <
>>>>>>>> helder.garay@corp.terra.com.br> wrote:
>>>>>>>> 
>>>>>>>>> Hi,
>>>>>>>>> I'm a dev working for a web portal in Brazil and I'm particularly
>>>>>>>>> interested in building an item-based collaborative filtering recommender
>>>>>>>>> for our database of news articles.
>>>>>>>>> After some coding, I was able to get some recommendations using a
>>>>>>>>> GenericItemBasedRecommender, a CassandraDataModel and some custom
>>>>>>>>> classes that store item similarities and migrated item IDs in
>>>>>>>>> Cassandra. But now I'm unsure what is normally done with this
>>>>>>>>> recommender: Should I run it as a daemon, cache the recommendations
>>>>>>>>> in memory and set up a web service to query it online? Should I
>>>>>>>>> precompute these recommendations for each recent user and store them
>>>>>>>>> somewhere? My first idea was to store all these recs back in Cassandra,
>>>>>>>>> but looking at some classes it seems to me that the norm is to always
>>>>>>>>> read the input data and store the output using files. Is this a common
>>>>>>>>> practice that benefits from HDFS?
>>>>>>>>> My use case here is around 70k recommendation requests per
>>>>>>>>> second.
>>>>>>>>> 
>>>>>>>>> Thanks in advance,
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Helder Martins
>>>>>>>>> Portal Architecture and Backend Systems
>>>>>>>>> +55 (51) 3284-4475
>>>>>>>>> Terra
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> BF Lyon
>>>>> http://www.nowherenearithaca.com
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Luis Carlos Guerrero Covo
>> M.S. Computer Engineering
>> (57) 3183542047
>> 

-- 
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B

