spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Does DecisionTree model in MLlib deal with missing values?
Date Sun, 11 Jan 2015 11:51:37 GMT
I do not recall seeing support for missing values.

Categorical values are encoded as 0.0, 1.0, 2.0, ... When training the
model, you indicate which features are to be interpreted as categorical
with the categoricalFeaturesInfo parameter, which maps each feature's
offset (index) to the count of distinct categorical values for that
feature.
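
As a rough sketch of that encoding (plain Python, not MLlib code; the
helper name encode_categorical and the sample rows are made up for
illustration), string categories can be mapped to 0.0, 1.0, 2.0, ...
while building the index-to-count map at the same time:

```python
# Hypothetical helper, not part of MLlib: encodes categorical columns as
# 0.0, 1.0, 2.0, ... and builds the map that categoricalFeaturesInfo
# expects (feature index -> number of distinct categories).

def encode_categorical(rows, categorical_indices):
    """Return (encoded_rows, categorical_features_info)."""
    # Assign codes per categorical column, in sorted order of the distinct
    # values so the encoding is deterministic.
    codes = {}
    for j in categorical_indices:
        distinct = sorted({row[j] for row in rows})
        codes[j] = {value: float(k) for k, value in enumerate(distinct)}

    # Replace categorical values by their codes; keep numeric ones as floats.
    encoded = [
        [codes[j][v] if j in codes else float(v) for j, v in enumerate(row)]
        for row in rows
    ]
    info = {j: len(mapping) for j, mapping in codes.items()}
    return encoded, info


# Made-up sample data: feature 0 is numeric, feature 1 is categorical.
rows = [[1.5, "red"], [0.2, "green"], [3.1, "red"], [2.0, "blue"]]
features, categorical_features_info = encode_categorical(rows, [1])
```

Each encoded row would then be wrapped in a LabeledPoint together with
its label, and the map passed as the categoricalFeaturesInfo argument
when training (for example to DecisionTree.trainClassifier in PySpark).
Features missing from the map are treated as continuous.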

On Sun, Jan 11, 2015 at 6:54 AM, Carter <gyzhen@hotmail.com> wrote:
> Hi, I am new to the MLlib in Spark. Can the DecisionTree model in MLlib deal
> with missing values? If so, what data structure should I use for the input?
>
> Moreover, my data has categorical features, but the LabeledPoint requires
> "double" data type, in this case what can I do?
>
> Thank you very much.
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Does-DecisionTree-model-in-MLlib-deal-with-missing-values-tp21080.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>


