spark-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: How to specify “positive class” in sparkml classification?
Date Thu, 08 Jul 2021 00:58:37 GMT
The positive class is "1" and negative is "0" by convention; I don't think
you can change that (though you can translate your data if needed).
F1 is defined only in a one-vs-rest sense in multi-class evaluation. You
can set 'metricLabel' to define which class is 'positive' in multiclass -
everything else is 'negative'.
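To make the one-vs-rest idea concrete, here is a minimal sketch, not a definitive recipe. The helper names f1_for_label and make_f1_evaluator are hypothetical, as are the default column names "label" and "prediction". The first function computes F1 in plain Python with one chosen class as 'positive'. The second assumes Spark 3.0 or later, where, as far as I know, 'metricLabel' is read by the per-label metrics such as "fMeasureByLabel", "precisionByLabel" and "recallByLabel".

```python
def f1_for_label(labels, predictions, positive):
    """One-vs-rest F1: `positive` is the positive class, every other label is negative."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    if tp == 0:
        # No true positives: report F1 as 0.0 instead of dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_f1_evaluator(positive_label, label_col="label", prediction_col="prediction"):
    """Hypothetical helper: build an evaluator that scores F1 for one chosen label.

    Assumes Spark 3.0+ and an active SparkSession; the column names are
    placeholders, not taken from the original pipeline.
    """
    # Imported here so that f1_for_label above can run without Spark installed.
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator

    return MulticlassClassificationEvaluator(
        labelCol=label_col,
        predictionCol=prediction_col,
        metricName="fMeasureByLabel",       # per-label F-measure (beta defaults to 1.0)
        metricLabel=float(positive_label),  # the class treated as 'positive'
    )


labels = [1, 1, 0, 0, 1]
predictions = [1, 0, 0, 1, 1]
f1_pos = f1_for_label(labels, predictions, positive=1)
f1_neg = f1_for_label(labels, predictions, positive=0)
```

Something like make_f1_evaluator(1.0, my_ml_algo.getLabelCol(), my_ml_algo.getPredictionCol()) could then be passed as the evaluator argument of CrossValidator. Note that 'metricLabel' refers to the indexed label value, so it depends on how the label indexer ordered the classes. With metricName="f1" the evaluator reports, to my understanding, a weighted average of per-class F1, so 'metricLabel' would not change that number.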

On Wed, Jul 7, 2021 at 7:19 PM Reed Villanueva <villanuevareed@gmail.com>
wrote:

> How to specify the "positive class" in sparkml binary classification? (Or
> perhaps: How does a MulticlassClassificationEvaluator
> <https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.ml.evaluation.MulticlassClassificationEvaluator.html?highlight=multiclassclassificationevaluator>
> determine which class is the "positive" one when evaluating for, say, F1 or
> even just Recall?)
> I have a Pipeline like...
>
> pipeline = Pipeline(stages=[label_idxer, feature_idxer, onehotencoder,
>                             assembler, my_ml_algo, label_converter])
>
> crossval = CrossValidator(estimator=pipeline,
>                           evaluator=MulticlassClassificationEvaluator(
>                               labelCol=my_ml_algo.getLabelCol(),
>                               predictionCol=my_ml_algo.getPredictionCol(),
>                               metricName="f1"),
>                           numFolds=3)
>
> Is there a way to specify which label or index is the positive/negative
> class?
>
