mahout-user mailing list archives

From jamborta <jambo...@gmail.com>
Subject FileDataModel - taste library
Date Tue, 15 Dec 2009 14:39:59 GMT

hi,

I found something very weird, and I can't figure out what's wrong.
I use FileDataModel to read the dataset from disk:

DataModel model = new FileDataModel(new File("./data/all_data.data"));
int numUsers = model.getNumUsers();

On one machine it works as expected:

15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Creating FileDataModel for file .\data\all_data.data
15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 100000
15-Dec-2009 14:29:32 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 943 users
15-Dec-2009 14:29:33 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 943 users

This is correct: the file has 100000 lines and 943 users.
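
For reference, the expected counts can be cross-checked without Mahout at all. The following is only a minimal sketch using the standard library; the class and method names (DataFileCheck, countLinesAndUsers) are made up for illustration, and it assumes each line starts with the user ID followed by a comma or a tab.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Mahout: counts lines and distinct users in one file.
final class DataFileCheck {

    /** Returns {number of non-empty lines, number of distinct user IDs}. */
    static long[] countLinesAndUsers(Path file) throws IOException {
        long lines = 0;
        Set<String> users = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                lines++;
                // Assumption: the user ID is the first comma- or tab-separated field.
                users.add(line.split("[,\t]", 2)[0].trim());
            }
        }
        return new long[] {lines, users.size()};
    }

    public static void main(String[] args) throws IOException {
        // Default path taken from the snippet above; pass another path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "./data/all_data.data");
        long[] result = countLinesAndUsers(file);
        System.out.println("lines=" + result[0] + " users=" + result[1]);
    }
}
```

Running this on both machines against the same path would show whether the file itself differs, or whether the extra lines come from somewhere else.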
On another machine it seems to read a second file right after the first one.
It gives me this output:

15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Creating FileDataModel for file .\data\all_data.data
15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 100000
15-Dec-2009 14:35:13 org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
15-Dec-2009 14:35:15 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 1000000 lines
15-Dec-2009 14:35:15 org.slf4j.impl.JCLLoggerAdapter info
INFO: Read lines: 1000209
15-Dec-2009 14:35:17 org.slf4j.impl.JCLLoggerAdapter info
INFO: Processed 6040 users

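I have two datasets, but I only ask for all_data.data. For some reason the
second machine goes on to read more data from somewhere: the log ends with
1000209 lines and 6040 users instead of 100000 lines and 943 users.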
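
One thing that might be worth ruling out is a second file sitting next to all_data.data on that machine. The sketch below is only a diagnostic under an assumption: that the loader may also pick up files in the same directory whose names start with the same text up to the first period. I have not confirmed this for the version in use. The class and method names (SiblingFileCheck, filesSharingPrefix) are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Mahout: lists files that share a name prefix with the data file.
final class SiblingFileCheck {

    /**
     * Returns the sorted names of all regular files in dataFile's directory whose
     * names start with dataFile's name up to the first period (dataFile included).
     */
    static List<String> filesSharingPrefix(File dataFile) {
        String name = dataFile.getName();
        int period = name.indexOf('.');
        String prefix = period < 0 ? name : name.substring(0, period);

        File dir = dataFile.getAbsoluteFile().getParentFile();
        List<String> matches = new ArrayList<>();
        File[] children = dir.listFiles(); // null if the directory cannot be read
        if (children != null) {
            for (File child : children) {
                if (child.isFile() && child.getName().startsWith(prefix)) {
                    matches.add(child.getName());
                }
            }
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) {
        // Default path taken from the snippet above; pass another path as the first argument.
        File dataFile = new File(args.length > 0 ? args[0] : "./data/all_data.data");
        for (String match : filesSharingPrefix(dataFile)) {
            System.out.println(match);
        }
    }
}
```

If this prints more than one name on the second machine and only one on the first, the directory contents differ between the two machines, which would be a place to start looking.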

thanks a lot
-- 
View this message in context: http://old.nabble.com/FileDataModel---taste-library-tp26795792p26795792.html
Sent from the Mahout User List mailing list archive at Nabble.com.

