spark-dev mailing list archives

From Maciej Szymkiewicz <mszymkiew...@gmail.com>
Subject [MLLIB] RankingMetrics.precisionAt
Date Mon, 05 Dec 2016 23:30:02 GMT
Hi,

Could I ask for a fresh pair of eyes on this piece of code:

https://github.com/apache/spark/blob/f830bb9170f6b853565d9dd30ca7418b93a54fe3/mllib/src/main/scala/org/apache/spark/mllib/evaluation/RankingMetrics.scala#L59-L80

  @Since("1.2.0")
  def precisionAt(k: Int): Double = {
    require(k > 0, "ranking position k should be positive")
    predictionAndLabels.map { case (pred, lab) =>
      val labSet = lab.toSet

      if (labSet.nonEmpty) {
        val n = math.min(pred.length, k)
        var i = 0
        var cnt = 0
        while (i < n) {
          if (labSet.contains(pred(i))) {
            cnt += 1
          }
          i += 1
        }
        cnt.toDouble / k
      } else {
        logWarning("Empty ground truth set, check input data")
        0.0
      }
    }.mean()
  }


Am I the only one who thinks this doesn't do what it claims? Just for
reference:

  * https://web.archive.org/web/20120415101144/http://sas.uwaterloo.ca/stats_navigation/techreports/04WorkingPapers/2004-09.pdf
  * https://github.com/benhamner/Metrics/blob/master/Python/ml_metrics/average_precision.py
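To make the discrepancy concrete, here is a minimal, self-contained Scala sketch (names `sparkStylePrecisionAt` and `apk` are mine, not from the Spark source). It contrasts the snippet's computation, which always divides the hit count by k, with an average-precision-at-k computation in the style of the ml_metrics reference above:

```scala
object PrecisionCheck {
  // Mirrors the snippet above: count hits among the first
  // min(|pred|, k) predictions, then divide by k -- even when
  // fewer than k predictions were made.
  def sparkStylePrecisionAt(pred: Seq[Int], lab: Set[Int], k: Int): Double = {
    val n = math.min(pred.length, k)
    pred.take(n).count(lab.contains).toDouble / k
  }

  // Average precision at k in the style of ml_metrics'
  // average_precision.py: precision is recomputed at each rank
  // where a relevant item appears, then averaged over
  // min(|lab|, k).
  def apk(pred: Seq[Int], lab: Set[Int], k: Int): Double = {
    var hits = 0
    var score = 0.0
    for ((p, i) <- pred.take(k).zipWithIndex if lab.contains(p)) {
      hits += 1
      score += hits.toDouble / (i + 1) // precision at rank i + 1
    }
    if (lab.isEmpty) 0.0 else score / math.min(lab.size, k)
  }

  def main(args: Array[String]): Unit = {
    val pred = Seq(1, 2, 3) // a perfect ranking of only 3 items
    val lab = Set(1, 2, 3)
    println(sparkStylePrecisionAt(pred, lab, 5)) // 0.6: penalized for |pred| < k
    println(apk(pred, lab, 5))                   // 1.0
  }
}
```

With a perfect 3-item ranking and k = 5, the snippet's formula reports 0.6 while the reference-style metric reports 1.0, which is the mismatch I'm asking about.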

-- 
Best,
Maciej

