spark-dev mailing list archives

From Xiangrui Meng
Subject Re: [mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper
Date Thu, 18 Jun 2015 19:11:57 GMT
Hi Yu,

Reducing the code complexity on the Python side is certainly what we
want to see :) We didn't call Java directly in Python models because
Java methods don't work inside RDD closures, e.g.,
rdd.map(lambda x: model.predict(x[1])).
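
To illustrate the closure point, here is a minimal sketch that does not
use Spark at all. It assumes the usual explanation: a Python-side handle
to a JVM object is only valid on the driver and cannot be serialized into
a closure that runs on workers. FakeJavaHandle, WrapperModel,
PurePythonModel, and can_ship_to_workers are hypothetical names, and plain
pickle stands in for Spark's closure serializer.

```python
import pickle
import threading


class FakeJavaHandle:
    """Hypothetical stand-in for a driver-only handle to a JVM object."""

    def __init__(self, weights):
        self._weights = list(weights)
        # A lock cannot be pickled; it plays the role of the live
        # connection to the JVM that exists only on the driver.
        self._driver_only = threading.Lock()

    def predict(self, x):
        return sum(w * v for w, v in zip(self._weights, x))


class WrapperModel:
    """Model whose predict() delegates to the (simulated) JVM handle."""

    def __init__(self, handle):
        self._java_model = handle

    def predict(self, x):
        return self._java_model.predict(x)


class PurePythonModel:
    """Model that keeps its parameters as plain Python data."""

    def __init__(self, weights):
        self.weights = list(weights)

    def predict(self, x):
        return sum(w * v for w, v in zip(self.weights, x))


def can_ship_to_workers(model):
    """Return True if the model survives serialization (assumed proxy
    for being usable inside an RDD closure)."""
    try:
        pickle.dumps(model)
        return True
    except (TypeError, pickle.PicklingError):
        return False


pure = PurePythonModel([1.0, 2.0])
wrapped = WrapperModel(FakeJavaHandle([1.0, 2.0]))

# Both models agree on the driver; only the pure-Python one serializes.
print(pure.predict([1.0, 2.0]), wrapped.predict([1.0, 2.0]))
print(can_ship_to_workers(pure), can_ship_to_workers(wrapped))
```

In this sketch the wrapper works fine for driver-side calls such as
save/load, which is where delegating to the JVM looks safe.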

But I agree that for model save/load the implementation should be
simplified. Could you submit a PR and see how much code we can save?


On Wed, Jun 17, 2015 at 8:15 PM, Yu Ishikawa wrote:
> Hi all,
> I think we should refactor some machine learning model classes in Python to
> improve software maintainability.
> By inheriting from the JavaModelWrapper class, we can easily and directly
> call the Scala API for the model without going through PythonMLlibAPI.
> In some cases, a machine learning model class in Python has complicated
> variables. That makes it a little hard to implement import/export methods,
> and it is also a little troublesome to implement the same function in both
> Scala and Python. I also think standardizing how to create a model class in
> Python is important.
> What do you think about that?
> Thanks,
> Yu
> -----
> -- Yu Ishikawa
