spark-user mailing list archives

From "Leonard Cohen" <3498363...@qq.com>
Subject Re: How to convert String to Vector ?
Date Tue, 06 Sep 2016 10:09:35 GMT
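Hi,

In Scala, split the string on commas inside a map:
map(feature => feature.split(',').toList)
In Python, either split the string or evaluate it as a literal:
list(string.split(','))
eval(string)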
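Since the original question uses the Java API, the same splitting idea can be sketched in plain Java. This is only a minimal illustration, not a tested Spark pipeline: the class name FeatureParser and the method parseFeatures are hypothetical, and it assumes every value in the column is a fully written-out number in the form "[0, 1, 3]" (no elided entries).

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: turns a string such as "[0, 1, 3]"
// into a double[] by stripping the brackets and splitting on commas.
final class FeatureParser {

    static double[] parseFeatures(String raw) {
        String s = raw.trim();
        // Drop the surrounding square brackets if both are present.
        if (s.length() >= 2 && s.startsWith("[") && s.endsWith("]")) {
            s = s.substring(1, s.length() - 1).trim();
        }
        // An empty list yields an empty array.
        if (s.isEmpty()) {
            return new double[0];
        }
        String[] parts = s.split(",");
        double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Assumption: every entry is a valid number; otherwise
            // Double.parseDouble throws NumberFormatException.
            values[i] = Double.parseDouble(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseFeatures("[0, 1, 3]")));
    }
}
```

The resulting double[] could then be passed to Vectors.dense from org.apache.spark.ml.linalg, for example inside a UDF applied to the "features" column. That wiring is left out here because it depends on the Spark version in use.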




http://stackoverflow.com/questions/31376574/spark-rddstring-string-into-rddmapstring-string




------------------ Original ------------------
From:  "颜发才(Yan Facai)";<yafc18@gmail.com>;
Send time: Tuesday, Sep 6, 2016 5:56 PM
To: "user.spark"<user@spark.apache.org>; 

Subject:  How to convert String to Vector ?



Hi, 

I have a CSV file like this:

uid      mid      features       label

123    5231    [0, 1, 3, ...]    True

Both the "features" and "label" columns are used by GBTClassifier.



However, when I read the file:

Dataset<Row> samples = sparkSession.read().csv(file);
the type of the "features" column in samples is String.


My question is:

How can I map samples.select("features") to a Vector or another appropriate type,

so that I can use it for training, like this:
        GBTClassifier gbdt = new GBTClassifier()
                .setLabelCol("label")
                .setFeaturesCol("features")
                .setMaxIter(2)
                .setMaxDepth(7);



Thanks.