spark-user mailing list archives

From Axel Dahl <a...@whisperstream.com>
Subject performance when checking if data frame is empty or not
Date Tue, 08 Sep 2015 20:22:26 GMT
I have a join that fails when one of the data frames is empty.

To avoid this, I am hoping to check whether the data frame is empty before
the join.

The question is: what's the most performant way to do that?

Should I use df.count(), df.first(), or something else?
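
For concreteness, here is a rough sketch of the kind of check I have in mind, assuming the PySpark DataFrame API. The helper names is_empty and safe_join are hypothetical (they are not part of Spark), and returning None for the empty case is just a placeholder convention. It uses take(1) rather than count(), on the assumption that fetching at most one row is cheaper than counting every row:

```python
def is_empty(df):
    """Return True if df has no rows (hypothetical helper, not a Spark API).

    Assumes df exposes take(n) returning a list of at most n rows,
    as PySpark DataFrames do.
    """
    # take(1) asks for at most one row, so the job can stop early;
    # count() has to count every row in every partition.
    return len(df.take(1)) == 0


def safe_join(left, right, on):
    """Join left and right on `on`, or return None if either side is empty.

    Hypothetical helper: returning None is only a placeholder convention;
    the caller decides what an empty input should mean.
    """
    if is_empty(left) or is_empty(right):
        return None
    return left.join(right, on)
```

Usage would be something like result = safe_join(df_a, df_b, "id"), where df_a, df_b and the "id" column are made-up names, followed by a check for None before using the result.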

Thanks in advance,

-Axel
