spark-dev mailing list archives

From Hamel Kothari <hamelkoth...@gmail.com>
Subject ConvertToSafe being done before functions.explode
Date Thu, 28 Apr 2016 20:57:22 GMT
Hi all,

I've been looking at some of my query plans and noticed that pretty much
every explode I run (always over a column of ArrayData) is prefixed with a
ConvertToSafe call in the physical plan. Looking at Generate.scala, it
appears that Generate doesn't override canProcessUnsafeRows, which defaults
to false in SparkPlan. For clarity, I'm using functions.explode (which uses
the builtin Explode from generators.scala), not DataFrame.explode (which
requires a user function to be passed in).
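For anyone who wants to see the same thing, this is roughly how I'm observing it in a spark-shell (Spark 1.6-era API; the column names and sample data here are made up for illustration):

```scala
import org.apache.spark.sql.functions._

// A tiny DataFrame with an array column, mirroring the ArrayData case above.
val df = sqlContext.createDataFrame(Seq(
  (1, Seq("a", "b")),
  (2, Seq("c"))
)).toDF("id", "arr")

// Printing the physical plan shows a ConvertToSafe inserted below Generate.
df.select(explode(col("arr"))).explain()
```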

Is this behavior correct? I suspect that unless we're using a
UserDefinedGenerator, this conversion isn't necessary. Even in the
UserDefinedGenerator case, the UserDefinedGenerator expression code already
performs a manual convertToSafe itself. If my understanding is correct, we
should be able to set canProcessUnsafeRows to true in all cases. Can someone
who understands this part of the SQL code spot-check me on this?
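To make the mechanism concrete, here is a minimal self-contained sketch (NOT the actual Spark source, just an illustration of the inheritance pattern being described): in SparkPlan the flag defaults to false, so any operator that doesn't override it, like Generate today, gets a ConvertToSafe inserted before it by the planner; the proposed change is simply to override it to true.

```scala
// Stand-in for SparkPlan's default: operators are assumed unable to
// consume UnsafeRows unless they say otherwise.
trait PlanLike {
  def canProcessUnsafeRows: Boolean = false
}

// Generate as it stands: no override, so the default (false) applies and
// the planner wraps its child in ConvertToSafe.
object GenerateAsIs extends PlanLike

// The proposed change: declare that the operator can consume UnsafeRows
// directly, so no ConvertToSafe round-trip is needed.
object GenerateProposed extends PlanLike {
  override def canProcessUnsafeRows: Boolean = true
}
```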

Thanks,
Hamel
