spark-user mailing list archives

From Ndjido Ardo BAR <ndj...@gmail.com>
Subject Re: Labeledpoint
Date Tue, 21 Jun 2016 17:23:35 GMT
To answer your question more accurately: the model.fit(df) method takes
a DataFrame of Row(label=double, features=Vectors.dense([...])).

cheers,
Ardo.


On Tue, Jun 21, 2016 at 6:44 PM, Ndjido Ardo BAR <ndjido@gmail.com> wrote:

> Hi,
>
> You can use an RDD of LabeledPoints to fit your model. Check the docs for
> more examples:
> http://spark.apache.org/docs/latest/api/python/pyspark.ml.html?highlight=transform#pyspark.ml.classification.RandomForestClassificationModel.transform
>
> cheers,
> Ardo.
>
> On Tue, Jun 21, 2016 at 6:12 PM, pseudo oduesp <pseudo20140@gmail.com>
> wrote:
>
>> Hi,
>> I am a PySpark user and I want to test RandomForest.
>>
>> I have a DataFrame with 100 columns. Since I have to give an RDD or a
>> DataFrame to the algorithm, I transformed my DataFrame into only two
>> columns, label and features:
>>
>> df.label  df.features
>> 0         (517,(0,1,2,333,56 ...
>> 1         (517,(0,11,0,33,6 ...
>> 0         (517,(0,1,0,33,8 ...
>>
>> But I have no idea how to transform the DataFrame into the input the
>> algorithm expects. I tried the example on the official web page without
>> success.
>>
>> Please give me an example of how to make this work, especially with a
>> test set.
>>
>> Thanks
>>
>
>
