spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Spark random forest - string data
Date Fri, 16 Jan 2015 21:45:42 GMT
The implementation accepts an RDD of LabeledPoint only, so you
couldn't feed in strings from a text file directly. LabeledPoint is a
wrapper around double values rather than strings. How were you trying
to create the input then?

Yes, it only accepts numeric values, although you can encode
categorical values as 0, 1, 2, ... and tell the implementation which
features are categorical so that it treats them as such.
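
As a rough illustration of that encoding step, here is a minimal sketch in
plain Python, with no Spark dependency. The helper name encode_categorical
and the sample rows are made up for this example. It maps each distinct
string in the chosen columns to 0, 1, 2, ... and also returns a map from
feature index to number of categories.

```python
def encode_categorical(rows, categorical_columns):
    """Encode string categories as 0.0, 1.0, 2.0, ... per column.

    Hypothetical helper for illustration only. Returns a pair
    (encoded_rows, categorical_features_info), where the second value
    maps feature index -> number of distinct categories in that column.
    """
    # Build one value -> index map per categorical column. Sorting makes
    # the index assignment deterministic across runs.
    index_maps = {}
    for col in categorical_columns:
        distinct_values = sorted({row[col] for row in rows})
        index_maps[col] = {v: i for i, v in enumerate(distinct_values)}

    # Replace categorical strings with their index; parse the rest as floats.
    encoded_rows = []
    for row in rows:
        encoded_rows.append([
            float(index_maps[col][value]) if col in index_maps else float(value)
            for col, value in enumerate(row)
        ])

    categorical_features_info = {col: len(m) for col, m in index_maps.items()}
    return encoded_rows, categorical_features_info


# Made-up sample data: columns 0 and 2 are categorical, column 1 is numeric.
rows = [
    ["red", "1.5", "small"],
    ["blue", "2.0", "large"],
    ["red", "0.5", "medium"],
]
features, categorical_features_info = encode_categorical(rows, [0, 2])
```

In PySpark, I would expect each encoded row to be wrapped in a LabeledPoint
together with its label, and the map to be passed as the
categoricalFeaturesInfo argument of RandomForest.trainClassifier, but treat
that wiring as an assumption to check against the version you are running.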

On Fri, Jan 16, 2015 at 9:25 PM, Asaf Lahav <asaf.lahav@gmail.com> wrote:
> Hi,
>
> I have been playing around with the new version of Spark MLlib Random forest
> implementation, and while in the process, tried it with a file with String
> Features.
> While training, it fails with:
> java.lang.NumberFormatException: For input string.
>
>
> Is MLlib Random forest adapted to run on top of numeric data only?
>
> Thanks
