The OneHotEncoder does not accept multiple columns.

You can use Michal's suggestion, where he sets the stages in a Pipeline and then executes them.
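
A rough sketch of what the Pipeline approach could look like for the two columns from your example. This is not Michal's exact code. It assumes the Spark 2.0 ML API, that df1 is your DataFrame, and the output column names ("<col>Index", "<col>Vec") are made up for illustration:

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}

val featureCols = Array("category", "newone")

// Build one StringIndexer and one OneHotEncoder per column,
// since each of them takes a single input column.
val stages: Array[PipelineStage] = featureCols.flatMap { c =>
  val indexer = new StringIndexer()
    .setInputCol(c)
    .setOutputCol(s"${c}Index")   // hypothetical output name
  val encoder = new OneHotEncoder()
    .setInputCol(s"${c}Index")
    .setOutputCol(s"${c}Vec")     // hypothetical output name
  Seq[PipelineStage](indexer, encoder)
}

// Fit all stages in order and apply them to the same DataFrame.
val pipeline = new Pipeline().setStages(stages)
val encoded = pipeline.fit(df1).transform(df1)
```

The result should keep the original columns and add one index column and one vector column per input column.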

The other option is to write a function that one-hot encodes a single column and returns a DataFrame containing the encoded column, and then call it once for each of the remaining columns.
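
Something along these lines might work for the function approach. Again only a sketch under assumptions: oneHotEncodeColumn is a hypothetical helper name, the output column names are made up, and it relies on OneHotEncoder being a plain Transformer as in Spark 2.0 (so transform can be called without fit):

```scala
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import org.apache.spark.sql.DataFrame

// Hypothetical helper: index one string column, then one-hot encode the index.
def oneHotEncodeColumn(df: DataFrame, col: String): DataFrame = {
  val indexed = new StringIndexer()
    .setInputCol(col)
    .setOutputCol(s"${col}Index")
    .fit(df)
    .transform(df)

  new OneHotEncoder()
    .setInputCol(s"${col}Index")
    .setOutputCol(s"${col}Vec")
    .transform(indexed)
}

// Apply the helper to each column in turn, threading the DataFrame through.
val encoded = Array("category", "newone").foldLeft(df1)(oneHotEncodeColumn)
```

The foldLeft just calls the helper once per column, passing the output of one call as the input of the next.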




On Wed, Aug 17, 2016 at 10:59 AM, janardhan shetty <janardhanp22@gmail.com> wrote:
I had already tried it this way:

scala> val featureCols = Array("category","newone")
featureCols: Array[String] = Array(category, newone)

scala>  val indexer = new StringIndexer().setInputCol(featureCols).setOutputCol("categoryIndex").fit(df1)
<console>:29: error: type mismatch;
 found   : Array[String]
 required: String
        val indexer = new StringIndexer().setInputCol(featureCols).setOutputCol("categoryIndex").fit(df1)


On Wed, Aug 17, 2016 at 10:56 AM, Nisha Muktewar <nisha@cloudera.com> wrote:
I don't think it does. According to the documentation (https://spark.apache.org/docs/2.0.0-preview/ml-features.html#onehotencoder), it still accepts one column at a time.

On Wed, Aug 17, 2016 at 10:18 AM, janardhan shetty <janardhanp22@gmail.com> wrote:
2.0:

One-hot encoding currently accepts a single input column. Is there a way to include multiple columns?