spark-issues mailing list archives

From "Joseph K. Bradley (JIRA)" <>
Subject [jira] [Commented] (SPARK-7132) Add fit with validation set to GBT
Date Wed, 09 Sep 2015 01:02:45 GMT


Joseph K. Bradley commented on SPARK-7132:

How would the split be chosen?  It will be important for the user to be able to specify the
split; for that, an extra Boolean column seems like a reasonable choice.

> Add fit with validation set to GBT
> -------------------------------------------
>                 Key: SPARK-7132
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Priority: Minor
> In spark.mllib GradientBoostedTrees, we have a method runWithValidation which takes a
> validation set.  We should add that to the API.
> This will require a bit of thinking about how the Pipelines API should handle a validation
> set (since Transformers and Estimators only take 1 input DataFrame).  The current plan is
> to include an extra column in the input DataFrame which indicates whether the row is for
> training, validation, etc.
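
The indicator-column idea discussed above can be sketched in plain Python, without
Spark. This is only an illustration under assumptions: the column name "is_validation"
and the helper split_by_indicator are hypothetical and are not part of any Spark API.
Each row carries a Boolean flag, and a single input collection is partitioned into a
training set and a validation set.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def split_by_indicator(
    rows: List[Row], indicator_col: str = "is_validation"
) -> Tuple[List[Row], List[Row]]:
    """Partition rows into (training, validation) using a Boolean column.

    Hypothetical helper for illustration only; "is_validation" is an
    assumed column name, not one defined by Spark.
    """
    training = [r for r in rows if not r[indicator_col]]
    validation = [r for r in rows if r[indicator_col]]
    return training, validation


# A single input collection in which the user marks the validation rows.
rows: List[Row] = [
    {"features": [0.0, 1.0], "label": 0.0, "is_validation": False},
    {"features": [1.0, 0.0], "label": 1.0, "is_validation": False},
    {"features": [0.5, 0.5], "label": 1.0, "is_validation": True},
    {"features": [0.2, 0.9], "label": 0.0, "is_validation": False},
]

training, validation = split_by_indicator(rows)
```

Because the flag lives in the data itself, the user decides exactly which rows are held
out, and the estimator still receives only one input.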

This message was sent by Atlassian JIRA