spark-issues mailing list archives

From "Thomas Darimont (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-1406) PMML model evaluation support via MLib
Date Thu, 03 Apr 2014 12:51:15 GMT

    [ https://issues.apache.org/jira/browse/SPARK-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958776#comment-13958776 ]

Thomas Darimont edited comment on SPARK-1406 at 4/3/14 12:49 PM:
-----------------------------------------------------------------

Hi Sean,

thanks for responding so quickly :)

Sure, you can do that of course (that's what I currently do), but there are IMHO many interesting
use cases that would benefit from having direct PMML support, e.g.:
1) Initialize an algorithm with a set of prepared parameters by loading a PMML file and evaluate
the algorithm with Spark's infrastructure.
2) Abstract the configuration or construction of an algorithm via some kind of producer that
gets the PMML model as an input and returns a fully configured Spark representation of the
algorithm which is encoded in the PMML.
3) Support hot-replacing an algorithm (configuration) at runtime by just providing an updated
PMML model to the Spark infrastructure.
4) Use the transformation / normalization or even dynamic model selection support built into
PMML to select the appropriate algorithm (configuration) based on the input.

You could even use JPMML to get the PMML object model as a starting point.
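
To make use cases 1) to 3) a bit more concrete, here is a minimal sketch in plain Python (standard library only). It is not Spark or JPMML code: the names LinearModel, model_from_pmml and ModelHolder as well as the two sample documents are made up for illustration, and it only understands a PMML RegressionModel with a single RegressionTable of NumericPredictor elements (exponents, transformations and all other model types are ignored).

```python
import threading
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Namespace used by PMML 4.2 documents.
NS = {"pmml": "http://www.dmg.org/PMML-4_2"}

# Hypothetical sample models; a real file would also carry a Header,
# a DataDictionary and a MiningSchema.
PMML_V1 = """
<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.0">
      <NumericPredictor name="x1" coefficient="2.0"/>
      <NumericPredictor name="x2" coefficient="-0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

PMML_V2 = """
<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="0.0">
      <NumericPredictor name="x1" coefficient="1.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


@dataclass
class LinearModel:
    """Stand-in for the 'fully configured representation' of use case 2."""
    intercept: float
    coefficients: dict

    def predict(self, row):
        # intercept + sum(coefficient * input value) over all predictors
        return self.intercept + sum(
            coef * row[name] for name, coef in self.coefficients.items()
        )


def model_from_pmml(pmml_text):
    """Producer: takes PMML text, returns a configured LinearModel."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionModel/RegressionTable found")
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return LinearModel(float(table.get("intercept")), coefficients)


class ModelHolder:
    """Use case 3: swap the model at runtime by supplying new PMML."""

    def __init__(self, pmml_text):
        self._lock = threading.Lock()
        self._model = model_from_pmml(pmml_text)

    def replace(self, pmml_text):
        # Parse before taking the lock so a broken file keeps the old model.
        new_model = model_from_pmml(pmml_text)
        with self._lock:
            self._model = new_model

    def predict(self, row):
        with self._lock:
            model = self._model
        return model.predict(row)


holder = ModelHolder(PMML_V1)
print(holder.predict({"x1": 3.0, "x2": 4.0}))  # 1.0 + 2.0*3.0 - 0.5*4.0 = 5.0
holder.replace(PMML_V2)
print(holder.predict({"x1": 3.0, "x2": 4.0}))  # 0.0 + 1.0*3.0 = 3.0
```

In a Spark setting the predict call would presumably be applied to each input tuple of a distributed dataset, and the producer would return an MLlib model instead of this toy class, but that part is an assumption and not shown here.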


was (Author: thomasd):
Hi Sean,

thanks for responding so quickly :)

Sure, you can do that of course (that's what I currently do), but there are IMHO many interesting
use cases that would benefit from having direct PMML support, e.g.:
1) Initialize an algorithm with a set of prepared parameters by loading a PMML file and evaluate
the algorithm with Spark's infrastructure.
2) Abstract the configuration or construction of an algorithm via some kind of producer that
gets the PMML model as an input and returns a fully configured Spark representation of the
algorithm which is encoded in the PMML.
3) Support hot-replacing an algorithm (configuration) at runtime by just providing an updated
PMML model to the Spark infrastructure.

You could even use JPMML to get the PMML object model as a starting point.

> PMML model evaluation support via MLib
> --------------------------------------
>
>                 Key: SPARK-1406
>                 URL: https://issues.apache.org/jira/browse/SPARK-1406
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Thomas Darimont
>
> It would be useful if Spark provided support for the evaluation of PMML models (http://www.dmg.org/v4-2/GeneralStructure.html).
> This would allow analytical models that were created with a statistical modeling
> tool like R, SAS, SPSS, etc. to be used with Spark (MLlib), which would perform the actual model evaluation
> for a given input tuple. The PMML model would then just contain the "parameterization" of
> an analytical model.
> Other projects like JPMML-Evaluator do a similar thing.
> https://github.com/jpmml/jpmml/tree/master/pmml-evaluator



--
This message was sent by Atlassian JIRA
(v6.2#6252)
