spark-user mailing list archives

From: Koert Kuipers
Subject: spark 2.4.1 -> 3.0.0-SNAPSHOT mllib
Date: Tue, 23 Apr 2019 22:38:03 GMT
we recently started compiling against spark 3.0.0-SNAPSHOT (built in-house
from the master branch) to uncover any breaking changes that might be an
issue for us.

some of our tests that use mllib broke. most of the breakage is
immaterial: we had some magic numbers hard-coded and the results are
slightly different because spark changed its random number generation or
because spark fixed a genuine bug in a classifier, etc.

however, we see somewhat significant changes in the ALS factors and also in
the resulting recommendations, even though there seem to be no changes in
the ALS code between spark 2.4.1 and current master.

we cannot come up with a good explanation so far. any idea what is going on?
