mahout-user mailing list archives

From Sebastian Schelter <ssc.o...@googlemail.com>
Subject Re: Getting rating for all the files
Date Tue, 01 Oct 2013 06:11:57 GMT
There is no Hadoop-based user-based recommender at the moment.
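
For readers following the thread, here is a minimal plain-Java sketch of what a user-based estimate computes in memory. It is an illustration under stated assumptions, not Mahout's implementation: the class name UserBasedSketch, its methods, and the sample ratings are all hypothetical. It uses Pearson correlation over co-rated items and a similarity-weighted average, and it returns NaN when no positively similar user has rated the target item, which is one plausible reason why some users or items get no output at all.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: a tiny in-memory user-based estimator, not Mahout code. */
public class UserBasedSketch {

    /** Pearson correlation over co-rated items; NaN if it cannot be computed. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        double sumA = 0, sumB = 0;
        int n = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) { sumA += e.getValue(); sumB += other; n++; }
        }
        if (n < 2) return Double.NaN; // fewer than two co-rated items
        double meanA = sumA / n, meanB = sumB / n;
        double cov = 0, varA = 0, varB = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) continue;
            double da = e.getValue() - meanA, db = other - meanB;
            cov += da * db; varA += da * da; varB += db * db;
        }
        if (varA == 0 || varB == 0) return Double.NaN; // constant ratings
        return cov / Math.sqrt(varA * varB);
    }

    /** Similarity-weighted average of other users' ratings for the item. */
    static double estimate(Map<Long, Map<Long, Double>> prefs, long user, long item) {
        Map<Long, Double> mine = prefs.get(user);
        if (mine == null) return Double.NaN;
        double num = 0, den = 0;
        for (Map.Entry<Long, Map<Long, Double>> e : prefs.entrySet()) {
            if (e.getKey().longValue() == user) continue;
            Double rating = e.getValue().get(item);
            if (rating == null) continue;            // neighbour never rated the item
            double sim = pearson(mine, e.getValue());
            if (Double.isNaN(sim) || sim <= 0) continue; // unusable neighbour
            num += sim * rating;
            den += sim;
        }
        return den == 0 ? Double.NaN : num / den;    // NaN: no estimate possible
    }

    static void put(Map<Long, Map<Long, Double>> prefs, long user, long item, double rating) {
        prefs.computeIfAbsent(user, k -> new HashMap<>()).put(item, rating);
    }

    /** Hypothetical ratings: users 1 and 2 overlap, user 3 overlaps with nobody. */
    static Map<Long, Map<Long, Double>> sampleData() {
        Map<Long, Map<Long, Double>> prefs = new HashMap<>();
        put(prefs, 1, 10, 5); put(prefs, 1, 11, 3); put(prefs, 1, 12, 4);
        put(prefs, 2, 10, 4); put(prefs, 2, 11, 2); put(prefs, 2, 12, 3); put(prefs, 2, 13, 5);
        put(prefs, 3, 20, 1); put(prefs, 3, 21, 5);
        return prefs;
    }

    public static void main(String[] args) {
        Map<Long, Map<Long, Double>> prefs = sampleData();
        System.out.println(estimate(prefs, 1L, 13L)); // prints 5.0
        System.out.println(estimate(prefs, 3L, 10L)); // prints NaN
    }
}
```

In this toy data, user 3 shares no rated items with anyone, so every estimate for that user is NaN and nothing could be recommended, regardless of how large a recommendation limit is requested.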

On 01.10.2013 01:13, Martin, Nick wrote:
> Hi all, 
> 
> I have the same question as Deepak does below: where can I find the user-based
> recommender via the Mahout command line?
> 
> I don't see it listed in the valid program names:
> 
> Valid program names are:
>   arff.vector: : Generate Vectors from an ARFF file or directory
>   baumwelch: : Baum-Welch algorithm for unsupervised HMM training
>   canopy: : Canopy clustering
>   cat: : Print a file or resource as the logistic regression models would see it
>   cleansvd: : Cleanup and verification of SVD output
>   clusterdump: : Dump cluster output to text
>   clusterpp: : Groups Clustering Output In Clusters
>   cmdump: : Dump confusion matrix in HTML or text formats
>   cvb: : LDA via Collapsed Variation Bayes (0th deriv. approx)
>   cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.
>   dirichlet: : Dirichlet Clustering
>   eigencuts: : Eigencuts spectral clustering
>   evaluateFactorization: : compute RMSE and MAE of a rating matrix factorization against probes
>   fkmeans: : Fuzzy K-means clustering
>   fpg: : Frequent Pattern Growth
>   hmmpredict: : Generate random sequence of observations by given HMM
>   itemsimilarity: : Compute the item-item-similarities for item-based collaborative filtering
>   kmeans: : K-means clustering
>   lucene.vector: : Generate Vectors from a Lucene index
>   matrixdump: : Dump matrix in CSV format
>   matrixmult: : Take the product of two matrices
>   meanshift: : Mean Shift clustering
>   minhash: : Run Minhash clustering
>   parallelALS: : ALS-WR factorization of a rating matrix
>   recommendfactorized: : Compute recommendations using the factorization of a rating matrix
>   recommenditembased: : Compute recommendations using item-based collaborative filtering
>   regexconverter: : Convert text files on a per line basis based on regular expressions
>   rowid: : Map SequenceFile<Text,VectorWritable> to {SequenceFile<IntWritable,VectorWritable>, SequenceFile<IntWritable,Text>}
>   rowsimilarity: : Compute the pairwise similarities of the rows of a matrix
>   runAdaptiveLogistic: : Score new production data using a probably trained and validated AdaptivelogisticRegression model
>   runlogistic: : Run a logistic regression model against CSV data
>   seq2encoded: : Encoded Sparse Vector generation from Text sequence files
>   seq2sparse: : Sparse Vector generation from Text sequence files
>   seqdirectory: : Generate sequence files (of Text) from a directory
>   seqdumper: : Generic Sequence File dumper
>   seqmailarchives: : Creates SequenceFile from a directory containing gzipped mail archives
>   seqwiki: : Wikipedia xml dump to sequence file
>   spectralkmeans: : Spectral k-means clustering
>   split: : Split Input data into test and train sets
>   splitDataset: : split a rating dataset into training and probe parts
>   ssvd: : Stochastic SVD
>   svd: : Lanczos Singular Value Decomposition
>   testnb: : Test the Vector-based Bayes classifier
>   trainAdaptiveLogistic: : Train an AdaptivelogisticRegression model
>   trainlogistic: : Train a logistic regression using stochastic gradient descent
>   trainnb: : Train the Vector-based Bayes classifier
>   transpose: : Take the transpose of a matrix
>   validateAdaptiveLogistic: : Validate an AdaptivelogisticRegression model against hold-out data set
>   vecdist: : Compute the distances between a set of Vectors (or Cluster or Canopy, they must fit in memory) and a list of Vectors
>   vectordump: : Dump vectors from a sequence file to text
>   viterbi: : Viterbi decoding of hidden states from given output states sequence
> 
> -----Original Message-----
> From: Deepak Subhramanian [mailto:deepak.subhramanian@gmail.com] 
> Sent: Sunday, September 29, 2013 4:06 PM
> To: user@mahout.apache.org
> Subject: Re: Getting rating for all the files
> 
> I tried writing a UserRecommendation program in Java, but it gives me fewer results than
> the ItemBasedRecommendation. Does anyone else have any thoughts on my previous question?
> 
> 
> On Sun, Sep 29, 2013 at 7:24 PM, Deepak Subhramanian <deepak.subhramanian@gmail.com> wrote:
> 
>> Thanks Nick. I am planning to try user-based recommendation, since
>> there is a low number of users. I don't see a recommenduserbased option
>> in the command-line utility for Mahout. Does that mean I have to write
>> a Java program to use the UserBasedRecommender?
>>
>>
>> On Sun, Sep 29, 2013 at 7:22 PM, Martin, Nick <NiMartin@pssd.com> wrote:
>>
>>> I'll need to defer to one of the other math whizzes on the potential 
>>> reasons for recommendations for certain users not appearing. My 
>>> suspicion is that you would either not have sufficient co-occurrence 
>>> of specific users/items to support a recommendation or you may need 
>>> to experiment with a different similarity measure.
>>>
>>> Anyone else want to weigh in?
>>>
>>>
>>>
>>> Sent from my iPhone
>>>
>>> On Sep 29, 2013, at 1:14 PM, "Deepak Subhramanian" < 
>>> deepak.subhramanian@gmail.com> wrote:
>>>
>>>> Sorry, my mistake. I am getting the lower ratings for some of the users
>>>> and items, but my issue is not solved. I am not getting ratings for some
>>>> of the users and some of the ratings.
>>>>
>>>> My userFile has 8000 users and my itemsFile has 4000 items, but I get
>>>> recommendations for only 5000 users and 1500 items, and the maximum number
>>>> of recommendations given is 258. What could be the reason that there are
>>>> no item recommendations for 3000 users and 2500 items? Is it because no
>>>> similarities exist between those users and items?
>>>>
>>>>
>>>> On Sun, Sep 29, 2013 at 4:46 PM, Deepak Subhramanian < 
>>>> deepak.subhramanian@gmail.com> wrote:
>>>>
>>>>> Thanks Nick. As I mentioned earlier, I am getting ratings only for the top
>>>>> recommended products instead of ratings for all 4000 products. I am setting
>>>>> the numRecommendations parameter to 4000 and maxPrefsPerUser to 4000. Should
>>>>> it give 4000 items in the list for each user? For some reason the output for
>>>>> items which have lower ratings is not displayed. I see the default limit is 10.
>>>>>
>>>>> I am not sure whether I am not getting ratings for 4000 items because I am
>>>>> passing the wrong options for this Mahout version, or whether it is an issue
>>>>> with Mahout 0.7. I am using 0.7 (mahout-examples-0.7-cdh4.3.1.jar).
>>>>>
>>>>> I see the parameter name changed in the latest version I checked from git
>>>>> (0.9-SNAPSHOT):
>>>>>
>>>>> maxPrefsPerUserConsidered = jobConf.getInt(MAX_PREFS_PER_USER_CONSIDERED,
>>>>>     DEFAULT_MAX_PREFS_PER_USER_CONSIDERED);
>>>>>
>>>>> Will using the latest version help?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Sep 29, 2013 at 12:29 PM, Martin, Nick <NiMartin@pssd.com> wrote:
>>>>>
>>>>>> There should be a score after each recommended item (i.e. 123456:2.6) in
>>>>>> your output. Lower scores would be the ones you're interested in.
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Sep 28, 2013, at 8:25 AM, "Deepak Subhramanian" < 
>>>>>> deepak.subhramanian@gmail.com> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am trying to predict the ratings for some items for some users using
>>>>>>> item-based collaborative filtering. I tried using mahout recommenditembased,
>>>>>>> but it shows only the top 10 items, or more if I increase the limit by passing
>>>>>>> the --numRecommendations parameter. It doesn't show items which have a lower
>>>>>>> predicted rating. What is the best approach to get ratings for items which
>>>>>>> have a low predicted rating?
>>>>>>>
>>>>>>>
>>>>>>> I tried this command.
>>>>>>>
>>>>>>> mahout recommenditembased --input mahoutrecoinput --usersFile recouserlist --itemsFile recoitemlist --output /mahoutrecooutputpearsonnew -s SIMILARITY_PEARSON_CORRELATION --numRecommendations 4000 --maxPrefsPerUser 4000
>>>>>>>
>>>>>>> Also I tried using the estimatePreference method on the recommender.
>>>>>>>
>>>>>>> Please help.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Deepak Subhramanian
>>>>
>>>>
>>>>
>>>> --
>>>> Deepak Subhramanian
>>>
>>
>>
>>
>> --
>> Deepak Subhramanian
>>
> 
> 
> 
> --
> Deepak Subhramanian
> 

