spark-user mailing list archives

From Liquan Pei <>
Subject Re: return probability \ confidence instead of actual class
Date Mon, 22 Sep 2014 06:50:49 GMT
Hi Adamantios,

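For your first question, after you train the SVM, you get a model with a
vector of weights w and an intercept b. Points x such that w^T x + b = 1
and w^T x + b = -1 lie on the margins on either side of the decision
boundary w^T x + b = 0. The quantity w^T x + b for a point x is a
confidence measure of the classification: its sign gives the predicted
side, and the larger its absolute value, the farther x is from the boundary.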
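
To make the score concrete, here is a minimal sketch in plain Scala with no Spark dependency. The object and method names (MarginSketch, margin, predictedClass, isConfident) and the minMargin cutoff are made up for illustration and are not part of MLlib. It only shows how w^T x + b would be computed and used as a confidence filter.

```scala
// Illustrative only: hypothetical helper names, plain Scala, no Spark required.
object MarginSketch {
  // Raw SVM score: inner product of weights and features, plus the intercept.
  def margin(w: Array[Double], b: Double, x: Array[Double]): Double = {
    require(w.length == x.length, "weights and features must have the same length")
    w.zip(x).map { case (wi, xi) => wi * xi }.sum + b
  }

  // Class label from the sign of the score, assuming a threshold of 0.0.
  def predictedClass(score: Double): Double = if (score > 0.0) 1.0 else 0.0

  // Keep a prediction only if the point lies at least minMargin away from the boundary.
  // minMargin is an assumed, user-chosen cutoff.
  def isConfident(score: Double, minMargin: Double): Boolean =
    math.abs(score) >= minMargin
}
```

For example, with w = (2.0, -1.0), b = 0.5 and x = (1.0, 1.0), the score is 2.0 - 1.0 + 0.5 = 1.5, so the point is on the positive side and passes a cutoff of minMargin = 1.0.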

Code-wise, suppose you trained your model via
val model = SVMWithSGD.train(...)

You can then set a threshold by calling

model.setThreshold(your threshold here)

which sets the threshold that separates positive predictions from negative
ones.
For more info, please take a look at

For your second question, SVMWithSGD only supports binary classification.

Hope this helps,


On Sun, Sep 21, 2014 at 11:22 PM, Adamantios Corais <> wrote:

> Nobody?
> If that's not supported already, can you please, at least, give me a few hints
> on how to implement it?
> Thanks!
> On Fri, Sep 19, 2014 at 7:43 PM, Adamantios Corais <> wrote:
>> Hi,
>> I am working with the SVMWithSGD classification algorithm on Spark. It
>> works fine for me; however, I would like to distinguish the instances that
>> are classified with a high confidence from those with a low one. How do we
>> define the threshold here? Ultimately, I want to keep only those for which
>> the algorithm is very *very* certain about its decision! How to do
>> that? Is this feature supported already by any MLlib algorithm? What if I
>> had multiple categories?
>> Any input is highly appreciated!

Liquan Pei
Department of Physics
University of Massachusetts Amherst
