spark-user mailing list archives

From "Mendelson, Assaf" <Assaf.Mendel...@rsa.com>
Subject RE: DataFrame select non-existing column
Date Sat, 19 Nov 2016 06:45:09 GMT
In PySpark, for example, you would do something like:

import pyspark.sql.functions
df.withColumn("newColName", pyspark.sql.functions.lit(None))
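
The same idea can be generalized into a small preprocessing step that adds every column an older DataFrame lacks. What follows is only a minimal sketch, not something from the Spark API itself: the helper names `missing_columns` and `add_missing_columns`, and the `expected` mapping of column name to type string, are assumptions made for illustration. The cast is there because `lit(None)` on its own yields an untyped null column.

```python
from typing import Dict, List, Tuple


def missing_columns(existing: List[str], expected: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (name, type) pairs from `expected` that are absent from `existing`."""
    present = set(existing)
    return [(name, dtype) for name, dtype in expected.items() if name not in present]


def add_missing_columns(df, expected: Dict[str, str]):
    """Add every expected top-level column that `df` lacks, filled with typed nulls."""
    # Imported here so missing_columns() stays usable without a Spark install.
    from pyspark.sql import functions as F

    for name, dtype in missing_columns(df.columns, expected):
        # lit(None) alone has NullType; the cast gives the column a concrete type.
        df = df.withColumn(name, F.lit(None).cast(dtype))
    return df
```

Assuming a hypothetical mapping such as `expected = {"name": "string", "age": "int"}`, calling `old_df = add_missing_columns(old_df, expected)` before the select would give old and new data the same top-level columns. This sketch only covers top-level columns; a field missing from inside a struct is not handled by it.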

Assaf.
-----Original Message-----
From: Kristoffer Sjögren [mailto:stoffe@gmail.com] 
Sent: Friday, November 18, 2016 9:19 PM
To: Mendelson, Assaf
Cc: user
Subject: Re: DataFrame select non-existing column

Thanks for your answer. I have been searching the API for a way to do that, but I could not
find one.

Could you give me a code snippet?

On Fri, Nov 18, 2016 at 8:03 PM, Mendelson, Assaf <Assaf.Mendelson@rsa.com> wrote:
> You can always add the columns to old dataframes, giving them null (or some literal), as
> a preprocessing step.
>
> -----Original Message-----
> From: Kristoffer Sjögren [mailto:stoffe@gmail.com]
> Sent: Friday, November 18, 2016 4:32 PM
> To: user
> Subject: DataFrame select non-existing column
>
> Hi
>
> We have evolved a DataFrame by adding a few columns, but we cannot write select statements
> on these columns for older data that doesn't have them: they fail with an AnalysisException
> with the message "No such struct field".
>
> We also tried dropping columns but this doesn't work for nested columns.
>
> Any non-hacky ways to get around this?
>
> Cheers,
> -Kristoffer