spark-user mailing list archives

From Patrick McCarthy <pmccar...@dstillery.com>
Subject Re: Apache Spark: Parallelization of Multiple Machine Learning Algorithm
Date Tue, 05 Sep 2017 13:56:38 GMT
You might benefit from watching this JIRA issue -
https://issues.apache.org/jira/browse/SPARK-19071
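
Until something lands there, one workaround is to submit the independent fits from separate driver threads, since Spark can schedule jobs submitted from different threads concurrently. Below is a minimal sketch of that pattern in plain Python, assuming each (algorithm, fold) pair is an independent task. The names `k_fold_indices`, `run_parallel`, and `fit_and_score` are hypothetical; `fit_and_score` is only a stand-in for fitting an estimator on the training split and scoring it on the held-out split.

```python
from concurrent.futures import ThreadPoolExecutor


def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) pairs for k roughly equal folds."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def run_parallel(tasks, max_workers=4):
    """Run zero-argument callables on a thread pool; results keep task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def fit_and_score(name, train, test):
    # Hypothetical stand-in: a real version would fit the named estimator
    # on the training rows and evaluate it on the held-out rows.
    return (name, len(train), len(test))


algorithms = ["naive_bayes", "ann", "random_forest"]
folds = list(k_fold_indices(n_rows=100, k=10))

# One task per (algorithm, fold); default arguments bind the loop values.
tasks = [
    (lambda a=a, tr=tr, te=te: fit_and_score(a, tr, te))
    for a in algorithms
    for tr, te in folds
]
results = run_parallel(tasks, max_workers=8)
```

The per-fold outputs in `results` could then be collected as the input to the second-layer model in step B. How much real speedup this gives depends on whether the cluster has spare capacity while a single fit is running.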

On Sun, Sep 3, 2017 at 5:50 PM, Timsina, Prem <prem.timsina@mssm.edu> wrote:

> Is there a way to parallelize multiple ML algorithms in Spark? My use case
> is something like this:
>
> A) Run multiple machine learning algorithms (Naive Bayes, ANN, Random
> Forest, etc.) in parallel.
>
> 1) Validate each algorithm using 10-fold cross-validation
>
> B) Feed the output of step A) into a second-layer machine learning algorithm.
>
> My question is:
>
> Can we run multiple machine learning algorithms in step A in parallel?
>
> Can we do cross-validation in parallel? Like, run 10 iterations of Naive
> Bayes training in parallel?
>
>
>
> I was not able to find any way to run the different algorithms in parallel.
> And it seems cross-validation also cannot be done in parallel.
>
> I would appreciate any suggestions for parallelizing this use case.
>
>
>
> Prem
>
