spark-user mailing list archives

From "Timsina, Prem" <prem.tims...@mssm.edu>
Subject Apache Spark: Parallelization of Multiple Machine Learning Algorithm
Date Sun, 03 Sep 2017 21:50:40 GMT
Is there a way to parallelize multiple ML algorithms in Spark? My use case is something like
this:
A) Run multiple machine learning algorithms (Naive Bayes, ANN, Random Forest, etc.) in parallel.
1) Validate each algorithm using 10-fold cross-validation.
B) Feed the output of step A) into a second-layer machine learning algorithm.
My questions are:
Can we run the multiple machine learning algorithms in step A in parallel?
Can we do cross-validation in parallel? For example, can we run the 10 iterations of Naive Bayes
training in parallel?

I was not able to find any way to run the different algorithms in parallel, and it seems
cross-validation cannot be done in parallel either.
I would appreciate any suggestions for parallelizing this use case.
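
To make the use case concrete, here is a minimal sketch in plain Python (not Spark code) of the
structure I am after for step A: every (algorithm, fold) pair is an independent task, and all
of them are submitted to one thread pool. The names `k_fold_indices`, `evaluate_fold`, and
`run_layer_one` are hypothetical, and `evaluate_fold` returns a dummy score instead of training
a real model.

```python
from concurrent.futures import ThreadPoolExecutor


def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k (train, test) index pairs."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    pairs = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((train, test))
    return pairs


def evaluate_fold(algorithm, train, test):
    """Hypothetical placeholder for 'fit `algorithm` on train, score on test'.

    A real job would fit an estimator here; this stub only returns the
    fraction of rows held out, so the sketch runs without any ML library.
    """
    return len(test) / (len(train) + len(test))


def run_layer_one(algorithms, n_rows, k=10, max_workers=4):
    """Run every (algorithm, fold) task concurrently; return mean score per algorithm."""
    splits = k_fold_indices(n_rows, k)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # All tasks are submitted from the main thread (no nested submission),
        # so workers never block waiting on other tasks in the same pool.
        futures = {
            (name, i): pool.submit(evaluate_fold, name, train, test)
            for name in algorithms
            for i, (train, test) in enumerate(splits)
        }
        scores = {key: future.result() for key, future in futures.items()}
    return {
        name: sum(scores[(name, i)] for i in range(k)) / k
        for name in algorithms
    }


if __name__ == "__main__":
    layer_one = run_layer_one(["naive_bayes", "ann", "random_forest"], n_rows=100)
    # The per-algorithm results would then be fed to the second-layer model (step B).
    print(layer_one)
```

My assumption is that in Spark each `evaluate_fold` call would correspond to a separate training
job launched from its own driver thread; whether those jobs would really run concurrently on the
cluster is the part I am unsure about.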

Prem