spark-user mailing list archives

From Rishi Shah <rishishah.s...@gmail.com>
Subject [Pyspark 2.3] Logical operators (and/or) in pyspark
Date Mon, 13 May 2019 15:37:37 GMT
Hi All,

I am using the or operator "|" in a withColumn clause on a DataFrame in
PySpark. However, it appears to evaluate every condition even when the
first condition is true. Please find a sample below:

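from pyspark.sql.functions import col, udf
from pyspark.sql.types import BooleanType

# UDF: True when the string s is an element of the list arr
contains = udf(lambda s, arr: s in arr, BooleanType())

df.withColumn('match_flag', col('list_names').isNull() |
              contains(col('name'), col('list_names')))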

For rows where list_names is null, this throws an error: NoneType is
not iterable.
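
One possible workaround, sketched under the assumption that Spark does not
guarantee short-circuit evaluation of "|" when a UDF is involved, is to make
the UDF itself null-aware so it never iterates over None. The names
safe_contains and add_match_flag are hypothetical helpers introduced only
for illustration; the pyspark imports are deferred so the plain-Python check
can be exercised without a Spark session.

```python
def safe_contains(s, arr):
    """Null-aware membership test (hypothetical helper).

    Returns False instead of raising when arr is None.
    """
    return arr is not None and s in arr


def add_match_flag(df):
    """Add 'match_flag' using the null-aware UDF (hypothetical helper)."""
    # Imported lazily so safe_contains can be tested without Spark installed.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import BooleanType

    contains = udf(safe_contains, BooleanType())
    # The isNull() check still marks rows with a null list as matches;
    # the UDF no longer fails if it is evaluated on those rows anyway.
    return df.withColumn(
        'match_flag',
        col('list_names').isNull() | contains(col('name'), col('list_names')),
    )
```

With this sketch, add_match_flag(df) should behave the same for non-null
lists as the sample above, and the result no longer depends on the order in
which the two conditions are evaluated.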

Any idea?

-- 
Regards,

Rishi Shah
