spark-dev mailing list archives

From: Ilya Matiach
Subject: mllib metrics vs ml evaluators and how to improve apis for users
Date: Fri, 30 Dec 2016 04:36:55 GMT
Hi ML/MLLib developers,
I'm trying to add a weights column to the Spark ml evaluators (RegressionEvaluator, BinaryClassificationEvaluator,
MulticlassClassificationEvaluator) that use the mllib metrics classes, and I have a few questions (JIRA
SPARK-18693).  I didn't see any similar question on the forums or Stack Overflow.
Moving forward, will we keep the mllib metrics classes (RegressionMetrics, MulticlassMetrics, BinaryClassificationMetrics)
as something separate from the evaluators, or will we remove them when mllib itself is removed from Spark?
The mllib metrics classes seem very useful because they can compute and expose many metrics on
one dataset, whereas with the evaluators it is not performant to re-evaluate the entire dataset
for each additional metric.
For example, if I calculate the RMSE and then the MSE using the ML RegressionEvaluator, I end up
doing most of the work twice, so the ML API doesn't make sense to use in this scenario.
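To make the cost concrete, here is a minimal sketch (assuming a scored DataFrame named "predictions" with "label" and "prediction" columns) of what each approach looks like today:

import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.mllib.evaluation.RegressionMetrics

// ml evaluator: each metric requires a separate evaluate() call,
// and therefore a separate pass over the scored data
val evaluator = new RegressionEvaluator()
  .setLabelCol("label")
  .setPredictionCol("prediction")
val rmse = evaluator.setMetricName("rmse").evaluate(predictions)
val mse  = evaluator.setMetricName("mse").evaluate(predictions)

// mllib metrics: one object computes its summary statistics once and
// exposes many metrics derived from them
val metrics = new RegressionMetrics(
  predictions.select("prediction", "label").rdd
    .map(row => (row.getDouble(0), row.getDouble(1))))
val rmse2 = metrics.rootMeanSquaredError
val mse2  = metrics.meanSquaredError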
Also, the ml evaluators expose a lot fewer metrics than the mllib metrics classes, so it seems
like the ml evaluators are not at parity with the mllib metrics classes.
I can see how the ml evaluators are useful in CrossValidator, but for exploring all metrics
from a scored dataset it doesn't really make sense to use them.
From the viewpoint of exploring all metrics for a scored model, does this mean that the
mllib metrics classes should be moved to ml?
That would solve my issue if that is what is planned for the future.  However, that doesn't
make sense to me, because it may cause some confusion for ml users to see both metrics classes and evaluators
in the same package.

Instead, it seems like the ml evaluators need to be changed at the api layer to:

  1.  Allow the user to retrieve a single metric value (as today), or
  2.  Allow the user to retrieve all metrics or a specified set of metrics.
One possibility would be to overload evaluate so that we would have something like:

override def evaluate(dataset: Dataset[_]): Double
def evaluate(dataset: Dataset[_], metrics: Array[String]): Dataset[_]
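As a purely hypothetical usage sketch of that second overload (none of this exists in Spark today), a user could then write something like:

val summary = evaluator.evaluate(predictions, Array("rmse", "mse", "r2"))
summary.show()  // e.g. one row with a column per requested metric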

But for some metrics, like the confusion matrix, you couldn't really fit the data into the result
of the second API alongside the single-value metrics.
The format of the mllib metrics classes is much more convenient, since you can retrieve each metric directly as its natural type.
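For instance, with the existing mllib class the scalar-valued and matrix-valued metrics simply come back as different types (a sketch, assuming an RDD of (prediction, label) pairs built from the scored data):

import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.mllib.linalg.Matrix

val metrics = new MulticlassMetrics(predictionAndLabels)
val accuracy: Double = metrics.accuracy   // scalar metric
val cm: Matrix = metrics.confusionMatrix  // matrix-valued metric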
Following this line of thought, maybe the APIs could be:

override def evaluate(dataset: Dataset[_]): Double
def evaluateMetrics(dataset: Dataset[_]): RegressionEvaluation (or a classification/multiclass equivalent)

where the returned evaluation class would have fields very similar to the corresponding mllib
RegressionMetrics class, which the user could read directly.
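For regression, a minimal sketch of what such a result class could look like (all names here are hypothetical, mirroring the fields mllib's RegressionMetrics already exposes):

class RegressionEvaluation(
    val explainedVariance: Double,
    val meanAbsoluteError: Double,
    val meanSquaredError: Double,
    val rootMeanSquaredError: Double,
    val r2: Double)

// hypothetical usage:
// val eval = new RegressionEvaluator().evaluateMetrics(predictions)
// println(eval.rootMeanSquaredError)
// println(eval.meanSquaredError)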

Any thoughts/ideas about the Spark ml evaluator and mllib metrics APIs, coding suggestions for the
proposed API, or a general roadmap would be really appreciated.

Thank you, Ilya
