spark-user mailing list archives

From Amit Sharma <resolve...@gmail.com>
Subject Re: help on use case - spark parquet processing
Date Fri, 14 Aug 2020 02:21:12 GMT
Can you declare the fields in your case class as Option types?
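
A minimal sketch of that idea, assuming Spark 2.x or later with Scala. The case classes below are hypothetical and only mirror the structure described in the question; the field names must match the actual Parquet column names. The object name PartialParquet and the helper readCustomers are made up for illustration. Note that Option fields alone may not be enough when a column is missing from a file entirely, so the sketch also reads the files with the full schema derived from the case class, which should cause Spark to fill absent fields with null (decoded as None).

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical model: every field is optional so that a file lacking it
// can still be decoded (a missing value becomes None).
case class Address(name: Option[String])
case class Account(`type`: Option[String], addresses: Option[Seq[Address]])
case class Customer(name: Option[String], accounts: Option[Seq[Account]])

object PartialParquet {
  // Encoder and full schema derived from the target case class.
  private val customerEncoder: Encoder[Customer] = Encoders.product[Customer]

  // Read a Parquet path using the full Customer schema instead of the
  // schema stored in the files, then convert the DataFrame to a Dataset.
  def readCustomers(spark: SparkSession, path: String): Dataset[Customer] =
    spark.read
      .schema(customerEncoder.schema)
      .parquet(path)
      .as[Customer](customerEncoder)
}
```

Each of the three inputs could be read this way and then combined, for example by joining on a key and taking the first non-empty value per field. Which key and which precedence to use depends on the data and is not specified in this thread.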


Thanks
Amit

On Thu, Aug 13, 2020 at 12:47 PM manjay kumar <manjay.machine@gmail.com>
wrote:

> Hi,
>
> I have a use case where I need to merge three datasets and build a single
> one, filling in each field wherever data is available.
>
> And my dataset is a complex object.
>
> Customer
> - name - string
> - accounts - List<Account>
>
> Account
> - type - String
> - Adressess - List<Address>
>
> Address
> - name - String
>
> ----
>
> ---
>
>
> And it goes on.
>
> These files are in Parquet.
>
>
> Each of the three input datasets contains some of the details, and these
> need to be merged into one dataset that has all the information (I know
> which files need to be merged).
>
>
> I want to know how I should proceed with this.
>
> - My approach is to build a case class for the final output and parse the
> three datasets into it
>  (but this is failing because the inputs do not contain all the fields).
>
> So basically, what is the right approach to this kind of problem?
>
> Second, how can I convert a Parquet DataFrame to a Dataset when the
> Parquet struct does not have all the fields but the case class does?
> (I am getting the error: no struct type found.)
>
> Thanks
> Manjay Kumar
> 8320 120 839
>
>
>
