spark-issues mailing list archives

From "Nick Pentreath (JIRA)" <>
Subject [jira] [Resolved] (SPARK-14084) Parallel training jobs in model selection
Date Fri, 24 Feb 2017 07:22:44 GMT


Nick Pentreath resolved SPARK-14084.
          Resolution: Duplicate
    Target Version/s:   (was: )

> Parallel training jobs in model selection
> -----------------------------------------
>                 Key: SPARK-14084
>                 URL:
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.0.0
>            Reporter: Xiangrui Meng
> In CrossValidator and TrainValidationSplit, we run training jobs one by one. If users
> have a big cluster, they might see speed-ups if we parallelize the job submission on the driver.
> The trade-off is that we might need to make multiple copies of the training data, which could
> be expensive. It is worth testing to figure out the best way to implement it.
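
As a rough illustration of what "parallelize the job submission on the driver" could look like, here is a minimal sketch that submits each (parameter set, fold) training job from a thread pool and averages the metric per parameter set. It is not Spark's implementation. The names fit_and_score, parallel_model_selection, and parallelism, and the toy scoring rule, are all hypothetical stand-ins for a real estimator fit and evaluator call.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def fit_and_score(params, fold):
    """Hypothetical stand-in for one training job on one fold.

    A real driver would fit an estimator and evaluate it here; this toy
    version just returns a deterministic score so the sketch is runnable.
    """
    return -abs(params["regParam"] - 0.1) - 0.01 * fold


def parallel_model_selection(param_grid, num_folds, parallelism):
    """Submit all (params, fold) jobs from a thread pool on the driver."""
    jobs = list(product(range(len(param_grid)), range(num_folds)))
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # pool.map preserves input order, so scores line up with jobs.
        scores = list(
            pool.map(lambda job: fit_and_score(param_grid[job[0]], job[1]), jobs)
        )
    # Average the metric over folds for each parameter set.
    totals = [0.0] * len(param_grid)
    for (param_index, _), score in zip(jobs, scores):
        totals[param_index] += score
    avg_metrics = [total / num_folds for total in totals]
    best_index = max(range(len(param_grid)), key=lambda i: avg_metrics[i])
    return best_index, avg_metrics


param_grid = [{"regParam": 0.01}, {"regParam": 0.1}, {"regParam": 1.0}]
best_index, avg_metrics = parallel_model_selection(
    param_grid, num_folds=3, parallelism=4
)
```

Threads suit this pattern because each job mostly waits on the cluster rather than using driver CPU. The parallelism value caps how many jobs are in flight at once, which also bounds how many copies of the training data might be needed at the same time.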

This message was sent by Atlassian JIRA
