From: Pavan K Narayanan
Date: Sat, 23 Nov 2013 18:13:11 +0530
Subject: Re: HELP for implicit data feed back - beginner
To: user@mahout.apache.org

Hi Sebastian,

Pardon my ignorance, but how do you suggest we use
o.a.m.cf.taste.impl.recommender.GenericBooleanPrefItemBasedRecommender?

Can we use it by coding in Java? If yes, do we need Java EE? Is there a
Mahout perspective for the Eclipse IDE? Is it possible to use these classes
from the Mahout CLI? MiA mentions Java programs, but I am unsure how to
set up Mahout in Java. Could you please clarify this part?

Sincerely,
Pavan

On 23 November 2013 04:59, Sebastian Schelter wrote:

> Antony,
>
> You don't need numeric ratings or preferences for your recommender. I
> would suggest you start by using
>
> o.a.m.cf.taste.impl.recommender.GenericBooleanPrefItemBasedRecommender
>
> which has explicitly been built to support scenarios without ratings. I
> would further suggest using
>
> o.a.m.cf.taste.impl.similarity.LogLikelihoodSimilarity
>
> as the similarity measure.
>
> Best,
> Sebastian
>
>
> On 22.11.2013 22:37, Antony Adopo wrote:
> > ok, thank you so much.
> > I will start like this and then try some tricks to
> > increase accuracy
> >
> >
> > 2013/11/22 Manuel Blechschmidt
> >
> >> Hello Antony,
> >> you can use the following project as a starting point:
> >> https://github.com/ManuelB/facebook-recommender-demo
> >>
> >> Further, you can purchase support for Mahout from many companies, e.g.
> >> MapR, Apaxo or Cloudera.
> >>
> >> For implicit feedback just use a 1 as preference and the
> >> LogLikelihoodSimilarity.
> >>
> >> Hope that helps
> >> Manuel
> >>
> >> On 22.11.2013, at 16:22, Antony Adopo wrote:
> >>
> >>> Thanks.
> >>> I've already seen this, but my question is: does Mahout offer a
> >>> collaborative filtering function that is not based on preferences?
> >>> Or how can I model this with purchases?
> >>>
> >>> Thanks
> >>>
> >>>
> >>> 2013/11/22 Smith, Dan
> >>>
> >>>> Hi Anthony,
> >>>>
> >>>> I would suggest looking into the collaborative filtering functions.
> >>>> It will work best if you have your customers segmented into similar
> >>>> groups, such as those that buy high end goods vs low end.
> >>>>
> >>>> _Dan
> >>>>
> >>>> On 11/22/13 11:04 AM, "Antony Adopo" wrote:
> >>>>
> >>>>> Ok. Thanks for answering so quickly.
> >>>>>
> >>>>> I forgot to mention that the customer table has a "job" variable,
> >>>>> and I thought that this variable would also be needed for accurate
> >>>>> recommendations. Anyway:
> >>>>>
> >>>>> I have around 200 000 customers
> >>>>> My order table is around 12 000 000 orders
> >>>>> and I have around 2 000 000 distinct (customerID,itemID) tuples
> >>>>> About the (customerID,itemID) tuples: the Mahout and recommender
> >>>>> system literature I have read uses
> >>>>> (customerID,itemID,*preference*) and I don't have *preference*.
> >>>>> So is there a Mahout method or class that handles only
> >>>>> (customerID,itemID) data?
> >>>>> And is it possible to use external data such as job or RFM analysis
> >>>>> to get something more accurate?
> >>>>>
> >>>>> Sorry (for about 2 weeks I have had a headache over how to organize
> >>>>> all of this to build a great system). Propose your solutions and
> >>>>> then we'll see.
> >>>>>
> >>>>>
> >>>>> 2013/11/22 Sebastian Schelter
> >>>>>
> >>>>>> Hi Antony,
> >>>>>>
> >>>>>> I would start with a simple approach: extract all customerID,itemID
> >>>>>> tuples from the orders table and use them as your input data. How
> >>>>>> many of those do you have? The data size will dictate whether you
> >>>>>> need to employ a distributed approach to recommendation mining or not.
> >>>>>>
> >>>>>> --sebastian
> >>>>>>
> >>>>>> On 22.11.2013 19:21, Antony Adopo wrote:
> >>>>>>> Morning,
> >>>>>>>
> >>>>>>> My name is Antony and I have a great recommender system to build.
> >>>>>>>
> >>>>>>> I'm totally new to recommender systems. After reading all the
> >>>>>>> scientific papers, I didn't find relevant information to build mine.
> >>>>>>>
> >>>>>>> OK, my problem:
> >>>>>>>
> >>>>>>> I have to build a recommender system for a retail business that
> >>>>>>> sells building products.
> >>>>>>>
> >>>>>>> I don't have explicit data (ratings).
> >>>>>>>
> >>>>>>> I only have data about purchases: all transactions, orders and
> >>>>>>> dates, as follows:
> >>>>>>>
> >>>>>>> Orders table
> >>>>>>>
> >>>>>>> CustomerID
> >>>>>>> Sales_ID
> >>>>>>> Item_ID
> >>>>>>> Dates
> >>>>>>> Amount
> >>>>>>> quantity
> >>>>>>> channel_type (phone, mail, etc.)
> >>>>>>>
> >>>>>>>
> >>>>>>> I also have specific information about users
> >>>>>>>
> >>>>>>> Users table
> >>>>>>> CustomerID
> >>>>>>> Group (engaged, frequent,buyer, newyer, etc.)
> >>>>>>>
> >>>>>>> ...
> >>>>>>> and products
> >>>>>>>
> >>>>>>> Item_ID
> >>>>>>> Item_name
> >>>>>>> Iteem_parent (hierarchy)
> >>>>>>>
> >>>>>>> I don't know how to use all this information with Mahout (or other
> >>>>>>> tools or methods) to build a good recommendation system (all the
> >>>>>>> systems presented are based on ratings, and all the Mahout systems
> >>>>>>> I have seen are also based on ratings or preferences).
> >>>>>>>
> >>>>>>> At the beginning, I thought that I had to use classical data mining
> >>>>>>> methods such as clustering or association rules, but accurately
> >>>>>>> recommending n products among 2000 products, grouped into about 300
> >>>>>>> hierarchical parents (not linked to domain), becomes difficult with
> >>>>>>> classical data mining.
> >>>>>>> That is the reason I turned to recommender systems.
> >>>>>>>
> >>>>>>>
> >>>>>>> Please help.
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >> --
> >> Manuel Blechschmidt
> >> M.Sc. IT Systems Engineering
> >> Dortustr. 57
> >> 14467 Potsdam
> >> Mobile: 0173/6322621
> >> Twitter: http://twitter.com/Manuel_B
> >>
> >>
> >
> >
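
The advice quoted above (take the distinct customerID,itemID tuples, treat each one as a boolean preference, and compare items with a log-likelihood measure) can be illustrated with a small sketch. It is written against the plain JDK rather than Mahout so that it runs without extra dependencies. The class name ImplicitFeedbackSketch, its method names and the toy data are hypothetical. The mapping 1 - 1/(1 + LLR) and the choice to return 0 for item pairs that never co-occur are assumptions made for this sketch, not a description of how LogLikelihoodSimilarity is implemented.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, dependency-free sketch: boolean purchase data scored with a
// log-likelihood ratio (LLR). Not Mahout code.
final class ImplicitFeedbackSketch {

    // x * ln(x), with 0 * ln(0) taken as 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy of a list of counts: N ln N - sum(c ln c).
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            parts += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - parts;
    }

    // LLR (G^2 statistic) of a 2x2 contingency table.
    // k11: customers who bought both items, k12: only item A,
    // k21: only item B, k22: neither item.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point rounding.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matEntropy));
    }

    // Similarity in [0, 1) between two items, given the sets of customers
    // who bought each. Pairs with no common buyer score 0 (sketch choice).
    static double similarity(Set<Long> buyersA, Set<Long> buyersB, long totalCustomers) {
        Set<Long> both = new HashSet<>(buyersA);
        both.retainAll(buyersB);
        long k11 = both.size();
        if (k11 == 0) {
            return 0.0;
        }
        long k12 = buyersA.size() - k11;
        long k21 = buyersB.size() - k11;
        long k22 = totalCustomers - k11 - k12 - k21;
        if (k22 < 0) {
            throw new IllegalArgumentException("totalCustomers is smaller than the number of buyers");
        }
        double llr = logLikelihoodRatio(k11, k12, k21, k22);
        return 1.0 - 1.0 / (1.0 + llr);
    }

    // Groups (customerID, itemID) pairs by item. Repeated purchases of the
    // same item by the same customer collapse into one boolean preference.
    static Map<Long, Set<Long>> buyersByItem(long[][] pairs) {
        Map<Long, Set<Long>> result = new HashMap<>();
        for (long[] p : pairs) {
            result.computeIfAbsent(p[1], k -> new HashSet<>()).add(p[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy orders reduced to (customerID, itemID); made-up data.
        long[][] pairs = {
            {1, 10}, {1, 11}, {2, 10}, {2, 11}, {3, 10}, {3, 11},
            {4, 12}, {5, 12}, {6, 13}, {1, 10}
        };
        Map<Long, Set<Long>> buyers = buyersByItem(pairs);
        long totalCustomers = 6;
        System.out.println("sim(10, 11) = " + similarity(buyers.get(10L), buyers.get(11L), totalCustomers));
        System.out.println("sim(10, 12) = " + similarity(buyers.get(10L), buyers.get(12L), totalCustomers));
    }
}
```

With about 2 000 000 distinct tuples, the per-item buyer sets fit comfortably in memory on one machine, which matches the point above that the data size decides whether a distributed approach is needed.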