spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: MLlib - Naive Bayes Java example bug
Date Mon, 03 Nov 2014 19:29:24 GMT
Yes, good catch. I also think the "1.0 *" is a suboptimal way to cast to
double. I searched for similar issues and didn't see any. Open a PR --
I'm not sure this is enough to warrant a JIRA, but feel free to open
one as well.
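
To illustrate the point about "1.0 *", here is a minimal standalone sketch
(not the code from the docs page) comparing an explicit cast with the
multiplication trick. The class name AccuracyCast and the sample counts are
made up for illustration; in the docs example the two counts would come from
count() on the RDDs.

```java
class AccuracyCast {
    // Computes correct / total as a double using an explicit cast.
    // The cast applies to 'correct' first, so the division is floating-point.
    static double accuracy(long correct, long total) {
        return (double) correct / total;
    }

    public static void main(String[] args) {
        long correct = 3; // hypothetical number of matching predictions
        long total = 4;   // hypothetical size of the test set

        System.out.println(accuracy(correct, total)); // prints 0.75
        System.out.println(1.0 * correct / total);    // prints 0.75, but the intent is less obvious
        System.out.println(correct / total);          // prints 0, long division truncates
    }
}
```

Both forms give the same result here; the cast just states the intent
directly.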

On Mon, Nov 3, 2014 at 6:46 PM, Dariusz Kobylarz
<darek.kobylarz@gmail.com> wrote:
> Hi,
> I noticed a bug in the sample Java code on the MLlib Naive Bayes docs page:
> http://spark.apache.org/docs/1.1.0/mllib-naive-bayes.html
>
> In the filter:
>
> double accuracy = 1.0 * predictionAndLabel.filter(new
> Function<Tuple2<Double, Double>, Boolean>() {
>     @Override public Boolean call(Tuple2<Double, Double> pl) {
>       return pl._1() == pl._2();
>     }
>   }).count() / test.count();
>
> it compares the Double objects by reference, whereas it should compare their values:
>
> double accuracy = 1.0 * predictionAndLabel.filter(new
> Function<Tuple2<Double, Double>, Boolean>() {
>     @Override public Boolean call(Tuple2<Double, Double> pl) {
>       return pl._1().doubleValue() == pl._2().doubleValue();
>     }
>   }).count() / test.count();
>
> With the reference comparison, the Java version's accuracy is always 0.0.
> The Scala code outputs the correct value, 1.0.
>
> Thanks,
>
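
The reference-versus-value issue described in the quoted message can be
reproduced without Spark. The sketch below is a standalone illustration, not
the docs code: the class name DoubleCompare and the variables prediction and
label are made up, and they stand in for the boxed values returned by
pl._1() and pl._2().

```java
class DoubleCompare {
    // Reference comparison: true only if both variables refer to the same object.
    static boolean sameReference(Double a, Double b) {
        return a == b;
    }

    // Value comparison via unboxing, as in the proposed fix.
    static boolean sameValue(Double a, Double b) {
        return a.doubleValue() == b.doubleValue();
    }

    public static void main(String[] args) {
        // Two separately boxed values holding the same number.
        Double prediction = Double.valueOf(1.0);
        Double label = Double.valueOf(1.0);

        // Typically false on common JVMs, because the two boxes are distinct objects.
        System.out.println(sameReference(prediction, label));

        // true: the underlying primitive values are equal.
        System.out.println(sameValue(prediction, label));
    }
}
```

Since the filter in the docs example compares boxed Doubles with ==, it would
reject pairs whose values match, which is consistent with the reported
accuracy of 0.0.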


