spark-user mailing list archives

From Petr Shestov <pshes...@nvidia.com>
Subject Proper saving/loading of MatrixFactorizationModel
Date Mon, 20 Jul 2015 10:26:00 GMT
Hi all!
I have a MatrixFactorizationModel object. If I recommend products to a single user
right after constructing the model through ALS.train(...), it takes about 300 ms (for my data
and hardware). But if I save the model to disk and load it back, the same recommendation takes
almost 2000 ms. Spark also warns:
15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor does not have a partitioner. Prediction on individual records could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: User factor is not cached. Prediction could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor does not have a partitioner. Prediction on individual records could be slow.
15/07/17 11:05:47 WARN MatrixFactorizationModel: Product factor is not cached. Prediction could be slow.
How can I create or set a partitioner and cache the user and product factors after loading the
model? The following approach didn't help:
model.userFeatures().cache();
model.productFeatures().cache();
I also tried repartitioning those RDDs and creating a new model from the repartitioned
versions, but that didn't help either.
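
To illustrate why the "does not have a partitioner" warning matters for single-user prediction, here is a minimal sketch in plain Java. It is a toy model and not Spark code: the class PartitionedLookupSketch, its Entry type, and all method names are hypothetical, and ordinary lists stand in for RDD partitions. The idea it shows is that when the key-to-partition mapping is known, a lookup by id only has to scan one partition, whereas without that knowledge every partition has to be scanned.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: plain Java lists standing in for a keyed, partitioned dataset.
// None of these names come from Spark.
class PartitionedLookupSketch {

    /** One (id, factors) record, analogous to an entry of the user or product factors. */
    static final class Entry {
        final int id;
        final double[] factors;

        Entry(int id, double[] factors) {
            this.id = id;
            this.factors = factors;
        }
    }

    /** Number of records inspected by the most recent lookup. */
    static int scanned = 0;

    /** Places each record in partition floorMod(id, numPartitions). */
    static List<List<Entry>> hashPartition(List<Entry> data, int numPartitions) {
        List<List<Entry>> parts = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            parts.add(new ArrayList<>());
        }
        for (Entry e : data) {
            parts.get(Math.floorMod(e.id, numPartitions)).add(e);
        }
        return parts;
    }

    /** The key-to-partition mapping is known: scan only the partition that can hold the id. */
    static double[] lookupWithPartitioner(List<List<Entry>> parts, int id) {
        scanned = 0;
        double[] found = null;
        for (Entry e : parts.get(Math.floorMod(id, parts.size()))) {
            scanned++;
            if (e.id == id) {
                found = e.factors;
            }
        }
        return found;
    }

    /** The mapping is unknown: every record in every partition has to be checked. */
    static double[] lookupWithoutPartitioner(List<List<Entry>> parts, int id) {
        scanned = 0;
        double[] found = null;
        for (List<Entry> part : parts) {
            for (Entry e : part) {
                scanned++;
                if (e.id == id) {
                    found = e.factors;
                }
            }
        }
        return found;
    }

    /** Builds n made-up records with ids 0..n-1 and two-element factor vectors. */
    static List<Entry> sampleData(int n) {
        List<Entry> data = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            data.add(new Entry(i, new double[] {i, 2.0 * i}));
        }
        return data;
    }

    public static void main(String[] args) {
        List<List<Entry>> parts = hashPartition(sampleData(1000), 10);

        lookupWithPartitioner(parts, 437);
        System.out.println("records scanned with partitioner: " + scanned);

        lookupWithoutPartitioner(parts, 437);
        System.out.println("records scanned without partitioner: " + scanned);
    }
}
```

In Spark terms this corresponds to the difference between calling lookup on a pair RDD that has a known partitioner and on one that does not. The sketch says nothing about the caching half of the warnings, which is a separate cost.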


