mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: Persistent Data Model
Date Tue, 15 May 2012 10:16:04 GMT
You do it by writing your own code: read the precomputed values from
wherever they are stored and create ItemItemSimilarity objects from them.

But if you're starting from a file, save yourself the time and use
FileDataModel.
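
A minimal sketch of the file-reading half, assuming one similarity per
line in the form "itemID1,itemID2,value". The file layout and the
SimilarityFileParser and Triple names are hypothetical and not part of
Mahout; the sketch uses only the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: parses "itemID1,itemID2,value" lines.
final class SimilarityFileParser {

    /** One precomputed similarity between two items. */
    static final class Triple {
        final long itemID1;
        final long itemID2;
        final double value;

        Triple(long itemID1, long itemID2, double value) {
            this.itemID1 = itemID1;
            this.itemID2 = itemID2;
            this.value = value;
        }
    }

    /** Parses the given lines, skipping blank lines and '#' comments. */
    static List<Triple> parse(List<String> lines) {
        List<Triple> result = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] parts = line.split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Expected 3 fields: " + line);
            }
            result.add(new Triple(
                    Long.parseLong(parts[0].trim()),
                    Long.parseLong(parts[1].trim()),
                    Double.parseDouble(parts[2].trim())));
        }
        return result;
    }
}
```

The lines could come from java.nio.file.Files.readAllLines(path). Each
Triple would then become a
new GenericItemSimilarity.ItemItemSimilarity(itemID1, itemID2, value),
and the resulting collection would be passed to the GenericItemSimilarity
constructor. Mahout also appears to ship a FileItemSimilarity class for
this case; check the javadoc for your version before relying on it.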

On Tue, May 15, 2012 at 10:58 AM, Nikolaos Romanos Katsipoulakis
<popanik@gmail.com> wrote:
> On 05/14/2012 05:09 PM, Sean Owen wrote:
>>
>> Yes, you don't want to use the database directly. A relational
>> database will never be fast enough.
>>
>> You want to use ReloadFromJDBCDataModel to load the data into memory
>> periodically. This is really what I was referring to.
>>
>> On Mon, May 14, 2012 at 1:49 PM, Nikolaos Romanos Katsipoulakis
>> <popanik@gmail.com> wrote:
>>>
>>> Thank you for your immediate response. Also, I have one more question.
>>> When I use data from a MySQL database, my recommender is very slow: about
>>> 2 minutes for a dataset of 1 million records. The code of the recommender
>>> is presented below:
>>>
>>> this.dataSource = new MysqlDataSource();
>>>
>>> this.dataSource.setServerName(serverName);
>>> this.dataSource.setUser(user);
>>> this.dataSource.setPassword(pass);
>>> this.dataSource.setDatabaseName(dbName);
>>>
>>> this.dataModel = new MySQLJDBCDataModel(this.dataSource, this.tableName,
>>>                this.userColumn, this.itemColumn, this.prefColumn,
>>>                this.timeStampColumn);
>>>
>>> this.similarity = new EuclideanDistanceSimilarity(this.dataModel);
>>> this.neighborhood = new NearestNUserNeighborhood(neighborhoodSize,
>>>                    this.similarity, this.dataModel);
>>>
>>> this.recommender = new GenericUserBasedRecommender(this.dataModel,
>>>                this.neighborhood, this.similarity);
>>> List<RecommendedItem> recommendations =
>>>                recommender.recommend(user, NumOfRecommendations);
>>>
>>> Why is it so slow? I should mention that I also used
>>> MysqlConnectionPoolDataSource() and the performance remains the same.
>>>
> Ok. Also, I have one more question. While I was reading the
> GenericItemSimilarity javadoc, I came across this:
>
> A "generic" |GenericItemSimilarity.ItemItemSimilarity|
> <https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/impl/similarity/GenericItemSimilarity.ItemItemSimilarity.html>
> which takes a static list of precomputed item similarities and bases its
> responses on that alone. The values may have been precomputed offline by
> another process, *stored in a file*, and then read and fed into an instance
> of this class.
>
> How is this done? This is what I actually needed from the beginning.
>
> Thank you.
