spark-user mailing list archives

From ayan guha <guha.a...@gmail.com>
Subject Re: Explode/Flatten Map type Data Using Pyspark
Date Fri, 15 Nov 2019 05:29:05 GMT
Hi Anbutech, in that case you will have a variable number of columns in the
output DataFrame, and therefore in the CSV. That will not be the best layout
if you intend to read the data back as CSV.
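One way around the variable-column problem is to take the union of all map keys up front and use it as a single fixed schema, filling missing keys with a placeholder. A minimal sketch of that logic in plain Python (dicts stand in for the Spark MapType column; the sample `events` data and the `-` placeholder are assumptions taken from the examples below). In PySpark the same idea would be: collect the distinct keys via `F.explode(F.map_keys(...))`, then project `F.col("data").getItem(k).alias(k)` for each key.

```python
import csv
import io

# Hypothetical sample events mirroring the thread's examples: each event id
# carries a map ("data") with a different number of keys.
events = [
    {"eve_id": "001", "data": {"k1": "abc", "k2": "x", "k3": "y"}},
    {"eve_id": "002", "data": {"k1": "12", "k2": "jack", "k3": "0.01", "k4": "0998"}},
]

# Union of all keys across events gives one fixed, sorted header,
# so every row shares the same CSV schema.
all_keys = sorted({k for e in events for k in e["data"]})

buf = io.StringIO()
# restval="-" fills columns an event does not have, matching the "-" in
# the example output below.
writer = csv.DictWriter(buf, fieldnames=["eve_id"] + all_keys, restval="-")
writer.writeheader()
for e in events:
    writer.writerow({"eve_id": e["eve_id"], **e["data"]})

print(buf.getvalue())
```

With a fixed header like this, the CSV stays readable by any downstream consumer, at the cost of sparse `-` cells for events with fewer keys.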

On Fri, 15 Nov 2019 at 2:30 pm, anbutech <anbutech17@outlook.com> wrote:

> Hello Guha,
>
> The number of keys will be different for each event id. For example, if
> event id 005 has 10 keys, then I have to flatten all 10 of those keys in
> the final output. There is no fixed number of keys per event id.
>
> 001 -> 2 keys
>
> 002 -> 4 keys
>
> 003 -> 5 keys
>
> Each event id above has a different combination of keys, different from
> the others. I want to dynamically flatten the incoming data into the
> output S3 CSV file (writing all the flattened keys to the CSV path).
>
> flatten.csv
>
> eve_id  k1   k2    k3
> 001     abc  x     y
>
> eve_id  k1   k2    k3    k4
> 002     12   jack  0.01  0998
>
> eve_id  k1   k2    k3      k4        k5
> 003     aaa  xxxx  device  endpoint  -
>
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>
> --
Best Regards,
Ayan Guha
