spark-dev mailing list archives

From Himanshu Mohan
Subject RE: MatrixUDT and VectorUDT in Spark ML
Date Sat, 24 Mar 2018 04:39:00 GMT
I agree


From: Li Jin
Sent: Friday, March 23, 2018 8:24 PM
To: dev
Subject: MatrixUDT and VectorUDT in Spark ML

Hi All,

I came across these two types, MatrixUDT and VectorUDT, in Spark ML when doing feature extraction
and preprocessing with PySpark. However, when trying to do some basic operations, such as
vector multiplication and matrix multiplication, I had to fall back to a Python UDF.

It seems to me it would be very useful to have built-in operators on these types, just like
first-class Spark SQL types, e.g.,

df.withColumn('v', df.matrix_column * df.vector_column)

I wonder what other people's thoughts are on this?

