spark-dev mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: [MLLIB] RankingMetrics.precisionAt
Date Tue, 06 Dec 2016 01:45:19 GMT
I read it again and that looks like it implements mean precision@k as I
would expect. What is the issue?

On Tue, Dec 6, 2016, 07:30 Maciej Szymkiewicz <mszymkiewicz@gmail.com>
wrote:

> Hi,
>
> Could I ask for a fresh pair of eyes on this piece of code:
>
>
> https://github.com/apache/spark/blob/f830bb9170f6b853565d9dd30ca7418b93a54fe3/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L59-L80
>
>   @Since("1.2.0")
>   def precisionAt(k: Int): Double = {
>     require(k > 0, "ranking position k should be positive")
>     predictionAndLabels.map { case (pred, lab) =>
>       val labSet = lab.toSet
>
>       if (labSet.nonEmpty) {
>         val n = math.min(pred.length, k)
>         var i = 0
>         var cnt = 0
>         while (i < n) {
>           if (labSet.contains(pred(i))) {
>             cnt += 1
>           }
>           i += 1
>         }
>         cnt.toDouble / k
>       } else {
>         logWarning("Empty ground truth set, check input data")
>         0.0
>       }
>     }.mean()
>   }
>
>
> Am I the only one who thinks this doesn't do what it claims? Just for
> reference:
>
>
>    -
>    https://web.archive.org/web/20120415101144/http://sas.uwaterloo.ca/stats_navigation/techreports/04WorkingPapers/2004-09.pdf
>    -
>    https://github.com/benhamner/Metrics/blob/master/Python/ml_metrics/average_precision.py
>
> --
> Best,
> Maciej
>
>
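[Editor's note: to make the disagreement concrete, here is a minimal sketch, in Python rather than the thread's Scala, contrasting precision@k as the quoted Spark code computes it (hit count divided by k) with average precision@k in the style of the referenced ml_metrics library. The function names are illustrative, not part of either codebase.]

```python
def precision_at_k(pred, lab, k):
    """Precision@k as the quoted RankingMetrics.precisionAt computes it:
    count hits among the first min(len(pred), k) predictions, divide by k."""
    lab_set = set(lab)
    n = min(len(pred), k)
    hits = sum(1 for p in pred[:n] if p in lab_set)
    return hits / k

def average_precision_at_k(pred, lab, k):
    """Average precision@k in the style of ml_metrics' apk: average the
    precision at each rank where a relevant item appears."""
    lab_set = set(lab)
    if not lab_set:
        return 0.0
    score, hits = 0.0, 0
    for i, p in enumerate(pred[:k]):
        if p in lab_set:
            hits += 1
            score += hits / (i + 1)
    return score / min(len(lab_set), k)

pred = ["a", "b", "c", "d"]
lab = ["a", "c"]
print(precision_at_k(pred, lab, 3))          # 2/3: two hits in top 3
print(average_precision_at_k(pred, lab, 3))  # (1/1 + 2/3) / 2 = 5/6
```

The two quantities differ on the same input, which is the crux of the thread: the implementation computes the first, while the cited average-precision references describe the second.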
