I'm using Spark 1.6.2 for Vector-based UDAF and this works:

def inputSchema: StructType = new StructType().add("input", new VectorUDT())

Maybe it was made private in 2.0.

Hi Yanbo,

Thanks for your reply. I will keep an eye on that pull request.
For now, I decided to just put my code inside org.apache.spark.ml to be able to access private classes.
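
A minimal sketch of that workaround, assuming VectorUDT in Spark 2.0.0 is visible to any code declared under the org.apache.spark package. The sub-package name (udaf), the class name (VectorSum), and the element-wise sum itself are hypothetical choices for illustration, not the code actually used here, and the file lives in the user's own project rather than in Spark:

```scala
// Hypothetical sub-package: declared under org.apache.spark.ml only so that
// the package-private VectorUDT becomes accessible (assumption stated above).
package org.apache.spark.ml.udaf

import org.apache.spark.ml.linalg.{Vector, VectorUDT, Vectors}
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types.{DataType, StructType}

// Hypothetical UDAF: element-wise sum of a Vector column.
// Assumes all input vectors have the same size.
class VectorSum extends UserDefinedAggregateFunction {
  def inputSchema: StructType = new StructType().add("input", new VectorUDT())
  def bufferSchema: StructType = new StructType().add("sum", new VectorUDT())
  def dataType: DataType = new VectorUDT()
  def deterministic: Boolean = true

  // Start with an empty (null) buffer; the first non-null input replaces it.
  def initialize(buffer: MutableAggregationBuffer): Unit = buffer(0) = null

  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    if (!input.isNullAt(0)) buffer(0) = add(buffer.getAs[Vector](0), input.getAs[Vector](0))

  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    if (!buffer2.isNullAt(0)) buffer1(0) = add(buffer1.getAs[Vector](0), buffer2.getAs[Vector](0))

  def evaluate(buffer: Row): Any = buffer.getAs[Vector](0)

  // Returns b when there is no running sum yet, otherwise a dense element-wise sum.
  private def add(a: Vector, b: Vector): Vector =
    if (a == null) b
    else Vectors.dense(a.toArray.zip(b.toArray).map { case (x, y) => x + y })
}
```

It would then be applied like any other UDAF, for example df.agg(new VectorSum()(df("features"))), where the column name "features" is again only an assumption. The obvious drawback is that the code depends on Spark internals and may break if the visibility of VectorUDT changes.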


It seems that VectorUDT is private and currently cannot be accessed from outside Spark. It should be public, but we need to do some refactoring before making it public. You can refer to the discussion at https://github.com/apache/spark/pull/12259 .


I am writing a UDAF to be applied to a data frame column of type Vector
(spark.ml.linalg.Vector). I rely on spark/ml/linalg so that I do not have to
go back and forth between dataframe and RDD.

Inside the UDAF, I have to specify a data type for the input, buffer, and
output (as usual). VectorUDT is what I would use with

However, when I try to import it from spark.ml instead: import
I get a runtime error (no errors during the build):

class VectorUDT in package linalg cannot be accessed in package

Is this expected, and can you suggest a workaround?

I am using Spark 2.0.0.


