spark-dev mailing list archives

From Erik Erlandson
Subject UDAFs have an inefficiency problem
Date Wed, 27 Mar 2019 23:19:21 GMT
I describe some of the details here:

The short version of the story is that aggregating data structures (UDTs)
used by UDAFs are serialized to a Row object, and de-serialized, for every
row in a data frame.
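
To make the cost concrete, here is a rough toy model in plain Python. It is not Spark code and does not use any Spark API: the `Sketch` class, the tuple standing in for a Row, and both function names are hypothetical, chosen only for illustration. It counts how many conversions happen when the aggregation state is rebuilt from, and written back to, a row-like buffer on every input row, compared with keeping the state in memory and converting once at the end.

```python
class Sketch:
    """Hypothetical aggregating structure (stands in for a UDT)."""

    def __init__(self, total=0.0, count=0):
        self.total = total
        self.count = count

    def update(self, x):
        self.total += x
        self.count += 1

    def to_row(self):
        # Stand-in for serializing the structure to a Row: a plain tuple.
        return (self.total, self.count)

    @staticmethod
    def from_row(row):
        # Stand-in for de-serializing the structure from a Row.
        return Sketch(row[0], row[1])


def aggregate_per_row_roundtrip(rows):
    """Assumed model of the pattern described above: the state lives in
    row form, so every input row pays one from_row and one to_row."""
    calls = {"serialize": 0, "deserialize": 0}
    buffer = Sketch().to_row()
    calls["serialize"] += 1
    for x in rows:
        state = Sketch.from_row(buffer)
        calls["deserialize"] += 1
        state.update(x)
        buffer = state.to_row()
        calls["serialize"] += 1
    return buffer, calls


def aggregate_in_memory(rows):
    """Alternative for comparison: keep the live object across rows and
    convert to row form once, after the last update."""
    calls = {"serialize": 0, "deserialize": 0}
    state = Sketch()
    for x in rows:
        state.update(x)
    buffer = state.to_row()
    calls["serialize"] += 1
    return buffer, calls
```

In this model, n input rows cost n + 1 serializations and n de-serializations in the first function, versus a single serialization in the second. For a two-field tuple that is cheap, but for a large aggregating structure each conversion copies the whole structure.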
