spark-user mailing list archives

From Colin Williams <colin.williams.seat...@gmail.com>
Subject Re: Casting nested columns and updated nested struct fields.
Date Fri, 23 Nov 2018 19:35:05 GMT
It looks like this has already been reported. Unfortunately the issue
has been open for a year, but the fix should be released in Spark 3:
https://issues.apache.org/jira/browse/SPARK-22231
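
For earlier versions, one possible workaround for the question quoted
below is to rewrite the StructType of the top-level column and cast the
whole column to it, since withColumn only replaces top-level columns.
The following is a minimal sketch, not a tested or official approach:
the object name NestedCast and the helpers retype and castNested are
hypothetical, it assumes the path runs only through struct fields (no
arrays or maps), that no field name contains a dot, and that names match
case-sensitively.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{DataType, StructType}

// Hypothetical helper object; none of these names come from Spark itself.
object NestedCast {

  // Returns a copy of `dt` in which the field reached by `path` has type
  // `newType`. Only descends through StructType fields.
  def retype(dt: DataType, path: List[String], newType: DataType): DataType =
    (dt, path) match {
      case (_, Nil) => newType
      case (s: StructType, name :: rest) =>
        StructType(s.fields.map { f =>
          if (f.name == name)
            // Fields along the path are marked nullable, because a cast
            // such as string -> long can produce null values.
            f.copy(dataType = retype(f.dataType, rest, newType), nullable = true)
          else f
        })
      // The path does not lead through a struct: leave the type unchanged.
      case (other, _) => other
    }

  // Casts the top-level column to its rewritten type, so the nested field
  // changes type in place and no new root-level column is created.
  def castNested(df: DataFrame, path: String, newType: DataType): DataFrame = {
    val parts = path.split('.').toList
    val root = parts.head
    val target = retype(df.schema(root).dataType, parts.tail, newType)
    df.withColumn(root, df.col(root).cast(target))
  }
}
```

Under these assumptions, the call from the original question would become
NestedCast.castNested(df, "existing.top.level.FIELD_NAME", LongType). The
retype function on its own also covers the "update the nested schema
StructType" half of the question, since it works on a schema without
needing a dataframe.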
On Fri, Nov 23, 2018 at 8:42 AM Colin Williams
<colin.williams.seattle@gmail.com> wrote:
>
> This seems worth filing as a bug against withColumn.
>
> On Wed, Nov 21, 2018, 6:25 PM Colin Williams <colin.williams.seattle@gmail.com> wrote:
>>
>> Hello,
>>
>> I'm currently trying to update the schema for a dataframe with nested
>> columns. I would either like to update the schema itself or cast the
>> column without having to explicitly select all the columns just to
>> cast one.
>>
>> Regarding updating the schema, it looks like I would probably need
>> to write a more complex map over the schema to find the StructFields I
>> want to update and update them. I haven't found any examples of this,
>> but it seems like there should be a simpler way to do it.
>>
>> Regarding changing the column on the dataframe itself, using e.g.
>>
>> val newDF = df.withColumn("existing.top.level.FIELD_NAME",df.col("existing.top.level.FIELD_NAME").cast(LongType))
>>
>> I end up with a new column named "existing.top.level.FIELD_NAME" at
>> the root level instead of the nested column being updated to the new
>> type. Has anybody worked out how to update a nested column's data type,
>> and how to update the field's type in the nested schema StructType? Are
>> there any easy ways to do this, or is there a reason it is not trivial?
