spark-user mailing list archives

From "Spencer, Alex (Santander)" <Alex.Spen...@santander.co.uk.INVALID>
Subject BinaryClassificationMetrics - get raw tp/fp/tn/fn stats per threshold?
Date Fri, 02 Sep 2016 13:54:38 GMT
Hi,

BinaryClassificationMetrics exposes recall and precision byThreshold. Is there a way to get
the true negatives / false negatives etc. per threshold?

I have weighted my genuines and would like the adjusted precision / FPR. (Unless there is
a weighting option that I've missed; I have been over the class twice now and can't see
one.) I had to build my own, which feels like reinventing the wheel, and for a start it
isn't as safe or as fast as a built-in would be:

val threshold_stats = metrics.thresholds.cartesian(predictionAndLabels).map { case (t, (prob, label)) =>
  val selected = (prob >= t)
  val fraud = (label == 1.0)

  val tp = if (fraud && selected) 1 else 0
  val fp = if (!fraud && selected) 1 else 0
  val tn = if (!fraud && !selected) 1 else 0
  val fn = if (fraud && !selected) 1 else 0

  (t, (tp, fp, tn, fn))
}.reduceByKey((x, y) => (x._1 + y._1, x._2 + y._2, x._3 + y._3, x._4 + y._4))
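
For the weighting step, here is a minimal sketch of how the adjusted figures could be derived
from those raw counts. It assumes a single uniform weight per genuine (for example, the factor
by which genuines were down-sampled). Counts, WeightedStats and genuineWeight are hypothetical
names for illustration, not part of the Spark API.

```scala
// Hypothetical container for the per-threshold confusion counts.
final case class Counts(tp: Long, fp: Long, tn: Long, fn: Long)

object WeightedStats {
  // genuineWeight (assumed): how many real genuines each sampled genuine represents.
  // Returns (adjusted precision, adjusted FPR).
  def adjusted(c: Counts, genuineWeight: Double): (Double, Double) = {
    val wfp = c.fp * genuineWeight   // weighted false positives
    val wtn = c.tn * genuineWeight   // weighted true negatives
    val precision = if (c.tp + wfp == 0) 0.0 else c.tp / (c.tp + wfp)
    val fpr = if (wfp + wtn == 0) 0.0 else wfp / (wfp + wtn)
    (precision, fpr)
  }
}
```

With the RDD above this could be applied as
threshold_stats.mapValues { case (tp, fp, tn, fn) => WeightedStats.adjusted(Counts(tp, fp, tn, fn), w) },
where w is the assumed weight. Note that with one uniform weight on all genuines the FPR comes
out the same as the unweighted fp / (fp + tn); only precision changes.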

Kind Regards,
Alex.
