spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: possible bug in Spark's ALS implementation...
Date Sun, 16 Mar 2014 20:18:39 GMT
On Mar 14, 2014, at 5:52 PM, Michael Allman <msa@allman.ms> wrote:

> I also found that the product and user RDDs were being rebuilt many times
> over in my tests, even for tiny data sets. By persisting the RDD returned
> from updateFeatures() I was able to avoid a raft of duplicate computations.
> Is there a reason not to do this?

This sounds like a good thing to add, though I’d like to understand why these are being
recomputed (it seemed that the code would only use each one once). Do you have any sense why
that is?

Matei
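
For illustration only, here is a minimal sketch in plain Python (not Spark, and not the ALS code) of one way a lazily evaluated dataset can end up being rebuilt several times, and how persisting it avoids that. The class `LazyDataset`, its methods, and `expensive_update` are hypothetical stand-ins; the assumption is that the result of updateFeatures() is consumed by more than one downstream action.

```python
class LazyDataset:
    """Hypothetical stand-in for a lazily evaluated dataset (not Spark's RDD API)."""

    def __init__(self, compute):
        self._compute = compute   # zero-argument function that builds the data
        self._persist = False
        self._cached = None

    def persist(self):
        # Mark the dataset so its data is kept after the first evaluation.
        self._persist = True
        return self

    def _materialize(self):
        if self._persist:
            if self._cached is None:
                self._cached = self._compute()
            return self._cached
        # Not persisted: rebuild from scratch on every use.
        return self._compute()

    def map(self, f):
        # A derived dataset re-evaluates its parent whenever it is evaluated.
        return LazyDataset(lambda: [f(x) for x in self._materialize()])

    def count(self):
        return len(self._materialize())

    def collect(self):
        return list(self._materialize())


def count_recomputations(persist):
    """Return how many times the expensive step runs across three actions."""
    calls = {"n": 0}

    def expensive_update():
        # Hypothetical placeholder for an expensive feature update.
        calls["n"] += 1
        return [1.0, 2.0, 3.0]

    features = LazyDataset(expensive_update)
    if persist:
        features.persist()

    features.count()                             # action 1
    features.map(lambda x: x * 2).collect()      # action 2, via a derived dataset
    features.map(lambda x: x + 1).collect()      # action 3, via a derived dataset
    return calls["n"]
```

In this sketch, every action that reaches the unpersisted dataset triggers the expensive step again, while the persisted variant runs it once. Whether this is what actually happens in the ALS code is exactly the open question above.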