spark-user mailing list archives

From Hollin Wilkins <>
Subject Re: Question about Multinomial LogisticRegression in spark mllib in spark 2.1.0
Date Wed, 01 Feb 2017 23:56:33 GMT
Hey Aseem,

If you are looking for a full-featured library to execute Spark ML
pipelines outside of Spark, take a look at MLeap:

Not only does it support transforming single instances of a feature vector,
but you can execute your entire ML pipeline including feature extraction.


On Wed, Feb 1, 2017 at 8:49 AM, Seth Hendrickson <> wrote:

> In Spark.ML the coefficients are not "pivoted", meaning that the model does
> not set one of the coefficient sets equal to zero. You can read more about it
> here:
> regression#As_a_set_of_independent_binary_regressions
> You can translate your set of coefficients to a pivoted version by simply
> subtracting one of the sets of coefficients from all the others. That
> leaves the one you selected, the "pivot", as all zeros. You can then pass
> this into the mllib model, disregarding the "pivot" coefficients. The
> coefficients should be laid out like:
> [feature0_class0, feature1_class0, feature2_class0, intercept0,
> feature0_class1, ..., intercept1]
> So you have 9 coefficients and 3 intercepts, but you are going to get rid
> of one class's coefficients, leaving you with 6 coefficients and two
> intercepts - so a vector of length 8 for mllib's model.
> Note: if you use regularization then it is not exactly correct to convert
> from the non-pivoted version to the pivoted one, since the algorithms will
> give different results in those cases, though it is still possible to do it.
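
The subtraction described above can be sketched in plain Python, without any
Spark dependency. This is only an illustration under stated assumptions: the
function name pivot_coefficients and the sample numbers are made up for the
example, class 0 is assumed to be the pivot, and the output follows the
layout given in the quoted message (each remaining class's feature
coefficients followed by its intercept). Check the ordering against your own
model before passing the result to mllib.

```python
def pivot_coefficients(coef_rows, intercepts, pivot=0):
    """Turn non-pivoted multinomial coefficients into a flat pivoted vector.

    coef_rows:  one list of feature coefficients per class (hypothetical input)
    intercepts: one intercept per class
    pivot:      index of the class whose coefficients become all zeros
                (assumed to be class 0 here)
    """
    pivot_row = coef_rows[pivot]
    pivot_intercept = intercepts[pivot]
    flat = []
    for k, (row, b) in enumerate(zip(coef_rows, intercepts)):
        if k == pivot:
            continue  # the pivot's coefficients are all zero, so drop them
        # subtract the pivot's coefficients feature by feature
        flat.extend(w - p for w, p in zip(row, pivot_row))
        # the intercept follows the feature coefficients of each class
        flat.append(b - pivot_intercept)
    return flat


# Made-up 3-class, 3-feature example: 9 coefficients and 3 intercepts in.
coef = [[0.5, -1.0, 2.0],
        [1.5, 0.0, -0.5],
        [-2.0, 1.0, 1.0]]
intercepts = [0.1, -0.2, 0.3]

weights = pivot_coefficients(coef, intercepts)
print(len(weights))  # 8: 6 coefficients plus 2 intercepts
```

Subtracting the same vector from every class shifts all margins by the same
amount, so the predicted probabilities are unchanged for an unregularized
model.
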
> On Wed, Feb 1, 2017 at 3:42 AM, Aseem Bansal <> wrote:
>> *What I want to do*
>> I have trained an ml.classification.LogisticRegressionModel using the
>> spark ml package.
>> It has 3 features and 3 classes. So the generated model has coefficients
>> in a (3, 3) matrix and intercepts in a Vector of length 3, as expected.
>> Now, I want to take these coefficients and convert this
>> ml.classification.LogisticRegressionModel to an instance of
>> mllib.classification.LogisticRegressionModel.
>> *Why I want to do this*
>> Computational speed, as SPARK-10413 is still in progress and scheduled for
>> Spark 2.2, which is not yet released.
>> *Why I think this is possible*
>> I checked
>> thods.html#logistic-regression and in that example a multinomial
>> Logistic Regression is trained. So as per this the class
>> mllib.classification.LogisticRegressionModel can encapsulate these
>> parameters.
>> *Problem faced*
>> The only constructor in mllib.classification.LogisticRegressionModel
>> takes a single vector as coefficients and a single double as intercept,
>> but I have a Matrix of coefficients and a Vector of intercepts respectively.
>> I tried converting the matrix to a vector by just taking the values
>> (guesswork) but got:
>> requirement failed: LogisticRegressionModel.load with numClasses = 3 and
>> numFeatures = 3 expected weights of length 6 (without intercept) or 8 (with
>> intercept), but was given weights of length 9
>> So any ideas?
