spark-user mailing list archives

From DB Tsai <dbt...@stanford.edu>
Subject Re: Using String Dataset for Logistic Regression
Date Fri, 16 May 2014 05:52:47 GMT
You could also use dummy coding to convert categorical features to
numeric features.

http://en.wikipedia.org/wiki/Categorical_variable#Dummy_coding
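
As a rough illustration of dummy coding in plain Java, here is a minimal
sketch for the day-of-week case mentioned in the quoted reply below. The
class and method names (DummyCoding, encode, withNumeric) are hypothetical
and not part of Spark, and the choice of Sunday as the all-zeros reference
level is an assumption.

```java
import java.util.Arrays;
import java.util.List;

/** Dummy coding sketch: maps one categorical value to k-1 binary features. */
final class DummyCoding {
    // Six levels that each get their own 0/1 column.
    // Assumption: the seventh day (Sunday) is the reference level, encoded as all zeros.
    static final List<String> DAY_LEVELS =
            Arrays.asList("Mon", "Tue", "Wed", "Thu", "Fri", "Sat");

    /**
     * Returns an array with 1.0 in the column matching `value`.
     * Any value not in `levels` (the reference level, or an unseen string)
     * yields all zeros, so the two cases are indistinguishable here.
     */
    static double[] encode(String value, List<String> levels) {
        double[] features = new double[levels.size()];
        int column = levels.indexOf(value);
        if (column >= 0) {
            features[column] = 1.0;
        }
        return features;
    }

    /** Appends the dummy-coded columns after the numeric columns of one row. */
    static double[] withNumeric(double[] numeric, String category, List<String> levels) {
        double[] dummies = encode(category, levels);
        double[] row = Arrays.copyOf(numeric, numeric.length + dummies.length);
        System.arraycopy(dummies, 0, row, numeric.length, dummies.length);
        return row;
    }
}
```

For example, DummyCoding.withNumeric(new double[] {3.5}, "Wed",
DummyCoding.DAY_LEVELS) gives a 7-element array: the numeric value followed
by six 0/1 columns with the "Wed" column set. An array like that should be
usable as the feature vector of a training example, for instance via
Vectors.dense in MLlib.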

Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Wed, May 14, 2014 at 10:37 PM, Xiangrui Meng <mengxr@gmail.com> wrote:
> It depends on how you want to use the string features. For the day of
> the week, you can replace it with 6 binary features indicating
> Mon/Tue/Wed/Thu/Fri/Sat. -Xiangrui
>
> On Fri, May 9, 2014 at 5:31 AM, praveshjain1991
> <praveshjain1991@gmail.com> wrote:
>> I have been trying to use LR in Spark's Java API. I used the dataset given
>> along with Spark for the training and testing purposes.
>>
>> Now I want to use it on another dataset that contains string values along
>> with numbers. Is there any way to do this?
>>
>> I am attaching the dataset that I want to use.
>>
>> Thanks and Regards,
>>
>> Test.data
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n5523/Test.data>
>>
>>
>>
>> --
>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-String-Dataset-for-Logistic-Regression-tp5523.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
