spark-issues mailing list archives

From "Xiangrui Meng (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (SPARK-3934) RandomForest bug in sanity check in DTStatsAggregator
Date Fri, 17 Oct 2014 22:04:33 GMT

     [ https://issues.apache.org/jira/browse/SPARK-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng resolved SPARK-3934.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.0

Issue resolved by pull request 2785
[https://github.com/apache/spark/pull/2785]

> RandomForest bug in sanity check in DTStatsAggregator
> -----------------------------------------------------
>
>                 Key: SPARK-3934
>                 URL: https://issues.apache.org/jira/browse/SPARK-3934
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib
>            Reporter: Joseph K. Bradley
>            Assignee: Joseph K. Bradley
>             Fix For: 1.2.0
>
>
> When run with a mix of unordered categorical and continuous features on multiclass
> classification, RandomForest fails. The bug is in the sanity checks in getFeatureOffset
> and getLeftRightFeatureOffsets, which use the wrong indices when checking whether
> features are unordered.
> Proposal: Remove the sanity checks, since they are not really needed and would require
> DTStatsAggregator to keep track of an extra set of indices (for the feature subset).
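
The kind of index mismatch described above can be illustrated with a small sketch. This is not Spark's code: the function names, the data, and the reading that the checks confused positions within a node's feature subset with global feature indices are all assumptions made for illustration, written in Python rather than Spark's Scala.

```python
# Hypothetical illustration only; none of these names come from Spark's source.

def is_unordered_buggy(feature_subset, unordered_features, subset_index):
    # Assumed bug pattern: the position within the feature subset is treated
    # as if it were a global feature index.
    return subset_index in unordered_features


def is_unordered_correct(feature_subset, unordered_features, subset_index):
    # Map the subset position back to the global feature index before checking.
    return feature_subset[subset_index] in unordered_features


# Assumed setup: global feature 0 is an unordered categorical feature,
# and a node samples global features 2 and 0, in that order.
unordered_features = {0}
feature_subset = [2, 0]

# Position 1 in the subset refers to global feature 0, which is unordered.
buggy = is_unordered_buggy(feature_subset, unordered_features, 1)
correct = is_unordered_correct(feature_subset, unordered_features, 1)
print(buggy, correct)
```

Under these assumptions the buggy check rejects a feature that really is unordered, so a sanity check built on it would fail on valid input. The correct variant needs access to the feature subset, which matches the report's point that keeping the checks would require tracking an extra set of indices.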



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

