spark-user mailing list archives

From Franco Victorio
Subject Semi-supervised learning in MLlib
Date Sat, 27 Jan 2018 17:29:03 GMT
Hi, I'm working on the implementation of a semi-supervised algorithm in Spark
and I want it to implement the interfaces provided by MLlib, so that it can
use things like model selection.

My problem is that, as far as I can tell, the provided interfaces are meant
for supervised algorithms (for example, they assume all the training data is

The other problem is that this method is transductive, so it would receive a
DataFrame with features and label columns, where the label column would be
mostly null, and the algorithm would just fill in the null entries. What I
mean by this is that a `fit` stage doesn't really make sense. But if I
want to do model selection, I need to have an Estimator with configurable
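
A minimal sketch of one possible workaround for the transductive case described above, assuming it is acceptable for `fit` to run the whole algorithm and return a "model" that only stores the inferred labels. Plain Python lists stand in for a DataFrame, nearest-labeled-neighbor voting stands in for the actual semi-supervised method, and the names `NearestLabelEstimator`, `FilledLabelsModel` and the parameter `k` are hypothetical, not part of MLlib.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# One row of the stand-in "DataFrame": (row id, features, label or None)
Row = Tuple[int, List[float], Optional[float]]


@dataclass
class FilledLabelsModel:
    """Hypothetical 'model': only the labels inferred during fit, keyed by row id."""
    labels: Dict[int, float]

    def transform(self, rows: List[Row]) -> List[Row]:
        # Look up each row's label by id; rows not seen during fit get None.
        return [(rid, feats, self.labels.get(rid)) for rid, feats, _ in rows]


@dataclass
class NearestLabelEstimator:
    """Hypothetical estimator; k is the configurable parameter a tuner could vary."""
    k: int = 1

    def fit(self, rows: List[Row]) -> FilledLabelsModel:
        labeled = [(feats, y) for _, feats, y in rows if y is not None]
        if not labeled:
            raise ValueError("at least one labeled row is required")

        inferred: Dict[int, float] = {}
        for rid, feats, y in rows:
            if y is not None:
                inferred[rid] = y  # keep the labels that were already known
                continue
            # Take the k labeled rows closest in squared Euclidean distance.
            nearest = sorted(
                labeled,
                key=lambda p: sum((a - b) ** 2 for a, b in zip(feats, p[0])),
            )[: self.k]
            votes = [label for _, label in nearest]
            # Majority vote; ties go to the smallest label value.
            inferred[rid] = max(sorted(set(votes)), key=votes.count)
        return FilledLabelsModel(inferred)
```

With this shape the estimator exposes a configurable parameter for a tuner to vary, although scoring each candidate would still need some known labels to be hidden (set to null) before `fit`, which is an assumption about how model selection would be wired up.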

Is anyone aware of some work already done in Spark with these
characteristics? Are there plans to support this kind of algorithm in the

