spark-user mailing list archives

From Michał Zieliński <zielinski.mich...@gmail.com>
Subject Re: VectorUDT with spark.ml.linalg.Vector
Date Wed, 17 Aug 2016 18:46:12 GMT
I'm using Spark 1.6.2 for a Vector-based UDAF, and this works:

def inputSchema: StructType = new StructType().add("input", new VectorUDT())

Maybe it was made private in 2.0.
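
For 2.0, a possible sketch (not run against a cluster here). It assumes the public SQLDataTypes.VectorType value in org.apache.spark.ml.linalg, which exposes the vector type without touching the private VectorUDT class. The VectorSum class and its size parameter are hypothetical names used only for illustration:

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.ml.linalg.SQLDataTypes.VectorType
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types.{DataType, StructType}

// Hypothetical UDAF: element-wise sum of a column of ml.linalg vectors.
// `size` is the assumed fixed length of every input vector.
class VectorSum(size: Int) extends UserDefinedAggregateFunction {
  // VectorType stands in for `new VectorUDT()` from the 1.6 version.
  def inputSchema: StructType = new StructType().add("input", VectorType)
  def bufferSchema: StructType = new StructType().add("sum", VectorType)
  def dataType: DataType = VectorType
  def deterministic: Boolean = true

  // Start each partial aggregate from a zero vector.
  def initialize(buffer: MutableAggregationBuffer): Unit =
    buffer(0) = Vectors.zeros(size)

  // Fold one input row into the running sum, skipping nulls.
  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    if (!input.isNullAt(0))
      buffer(0) = add(buffer.getAs[Vector](0), input.getAs[Vector](0))

  // Combine two partial sums.
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    buffer1(0) = add(buffer1.getAs[Vector](0), buffer2.getAs[Vector](0))

  def evaluate(buffer: Row): Any = buffer.getAs[Vector](0)

  // Dense element-wise addition; assumes both vectors have length `size`.
  private def add(a: Vector, b: Vector): Vector =
    Vectors.dense(a.toArray.zip(b.toArray).map { case (x, y) => x + y })
}
```

It could then be applied as df.agg(new VectorSum(3)(df("features"))), where df and the "features" column are again placeholders.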

On 17 August 2016 at 05:31, Alexey Svyatkovskiy <alexeys@princeton.edu>
wrote:

> Hi Yanbo,
>
> Thanks for your reply. I will keep an eye on that pull request.
> For now, I decided to just put my code inside org.apache.spark.ml to be
> able to access private classes.
>
> Thanks,
> Alexey
>
> On Tue, Aug 16, 2016 at 11:13 PM, Yanbo Liang <ybliang8@gmail.com> wrote:
>
>> It seems that VectorUDT is private and currently cannot be accessed from
>> outside Spark. It should be public, but we need to do some refactoring
>> before making it public. You can refer to the discussion at
>> https://github.com/apache/spark/pull/12259 .
>>
>> Thanks
>> Yanbo
>>
>> 2016-08-16 9:48 GMT-07:00 alexeys <alexeys@princeton.edu>:
>>
>>> I am writing a UDAF to be applied to a data frame column of type Vector
>>> (spark.ml.linalg.Vector). I rely on spark/ml/linalg so that I do not
>>> have to go back and forth between dataframe and RDD.
>>>
>>> Inside the UDAF, I have to specify a data type for the input, buffer, and
>>> output (as usual). VectorUDT is what I would use with
>>> spark.mllib.linalg.Vector:
>>> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
>>>
>>> However, when I try to import it from spark.ml instead:
>>>
>>> import org.apache.spark.ml.linalg.VectorUDT
>>>
>>> I get a runtime error (no errors during the build):
>>>
>>> class VectorUDT in package linalg cannot be accessed in package org.apache.spark.ml.linalg
>>>
>>> Is this expected, and can you suggest a workaround?
>>>
>>> I am using Spark 2.0.0
>>>
>>> Thanks,
>>> Alexey
>>>
