Hey Aseem,
If you are looking for a full-featured library to execute Spark ML
pipelines outside of Spark, take a look at MLeap:
https://github.com/combust/mleap
Not only does it support transforming single instances of a feature vector,
but it can also execute your entire ML pipeline, including feature extraction.
Cheers,
Hollin
On Wed, Feb 1, 2017 at 8:49 AM, Seth Hendrickson <
seth.hendrickson16@gmail.com> wrote:
> In spark.ml the coefficients are not "pivoted", meaning that no class's
> set of coefficients is fixed at zero. You can read more about it here:
> https://en.wikipedia.org/wiki/Multinomial_logistic_regression#As_a_set_of_independent_binary_regressions
>
> You can translate your set of coefficients to a pivoted version by simply
> subtracting one of the sets of coefficients from all the others. That
> leaves the one you selected, the "pivot", as all zeros. You can then pass
> this into the mllib model, disregarding the "pivot" coefficients. The
> coefficients should be laid out like:
>
> [feature0_class0, feature1_class0, feature2_class0, intercept0,
> feature0_class1, ..., intercept1]
>
> So you have 9 coefficients and 3 intercepts, but you are going to get rid
> of one class's coefficients, leaving you with 6 coefficients and 2
> intercepts, so a vector of length 8 for mllib's model.
>
> Note: if you use regularization then it is not exactly correct to convert
> from the non-pivoted version to the pivoted one, since the algorithms will
> give different results in those cases, though it is still possible to do it.
>
> On Wed, Feb 1, 2017 at 3:42 AM, Aseem Bansal <asmbansal2@gmail.com> wrote:
>
>> *What I want to do*
>> I have trained an ml.classification.LogisticRegressionModel using the
>> spark ml package.
>>
>> It has 3 features and 3 classes. So the generated model has coefficients
>> in a (3, 3) matrix and intercepts in a Vector of length 3, as expected.
>>
>> Now, I want to take these coefficients and convert this
>> ml.classification.LogisticRegressionModel to an instance of
>> mllib.classification.LogisticRegressionModel.
>>
>> *Why I want to do this*
>> Computational speed, as SPARK-10413 is still in progress and scheduled for
>> Spark 2.2, which is not yet released.
>>
>> *Why I think this is possible*
>> I checked
>> https://spark.apache.org/docs/latest/mllib-linear-methods.html#logistic-regression
>> and in that example a multinomial Logistic Regression is trained. So, as
>> per this, the class mllib.classification.LogisticRegressionModel can
>> encapsulate these parameters.
>>
>> *Problem faced*
>> The only constructor in mllib.classification.LogisticRegressionModel
>> takes a single vector as coefficients and a single double as intercept,
>> but I have a Matrix of coefficients and a Vector of intercepts.
>>
>> I tried converting the matrix to a vector by just taking the values
>> (guesswork) but got
>>
>> requirement failed: LogisticRegressionModel.load with numClasses = 3 and
>> numFeatures = 3 expected weights of length 6 (without intercept) or 8 (with
>> intercept), but was given weights of length 9
>>
>> So any ideas?
>>
>
>
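
A minimal sketch of the pivoting step Seth describes above, written in plain
Python for illustration rather than as Spark code. It assumes class 0 is the
pivot (my understanding of what mllib's multinomial model expects, but treat
that as an assumption) and it assumes the flat layout from Seth's message:
each remaining class's feature weights followed by that class's intercept.
The function names pivot_on_class0 and margins, and the sample numbers, are
hypothetical.

```python
def pivot_on_class0(coef_rows, intercepts):
    """Subtract class 0's parameters from every other class and flatten.

    coef_rows:  K rows of F coefficients, one row per class.
    intercepts: K intercepts, one per class.
    Returns a flat list of length (K - 1) * (F + 1); class 0 is dropped
    because its pivoted coefficients and intercept are all zero.
    """
    pivot_row = coef_rows[0]
    pivot_intercept = intercepts[0]
    flat = []
    for row, b in zip(coef_rows[1:], intercepts[1:]):
        flat.extend(w - p for w, p in zip(row, pivot_row))
        flat.append(b - pivot_intercept)
    return flat


def margins(flat, x, num_features):
    """Per-class margins for the pivoted layout; class 0 is fixed at 0.0."""
    step = num_features + 1
    out = [0.0]
    for start in range(0, len(flat), step):
        w = flat[start:start + num_features]
        b = flat[start + num_features]
        out.append(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return out


# Hypothetical 3-class, 3-feature model (made-up values).
coef_rows = [
    [0.5, -1.0, 2.0],   # class 0
    [1.5, 0.0, -0.5],   # class 1
    [-2.0, 1.0, 1.0],   # class 2
]
intercepts = [0.25, -0.5, 0.75]

flat = pivot_on_class0(coef_rows, intercepts)
print(len(flat))  # (3 - 1) * (3 + 1) entries
print(margins(flat, [1.0, 2.0, 3.0], num_features=3))
```

With 3 classes and 3 features the result has length 8, which matches the
"with intercept" length in the error message. Subtracting the same vector
from every class shifts all margins by the same amount, so the predicted
class for a given model is unchanged.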
