spark-user mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject Re: help coercing types
Date Fri, 18 Mar 2016 21:24:01 GMT
Hi,

Just a side question: why do you convert the DataFrame to an RDD? It's
like driving backwards: possible, but inefficient and at times dangerous.
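
To make that concrete, here is a minimal sketch of doing the grouping
without leaving the DataFrame. It assumes Spark 2.0 or later (where
collect_list needs no Hive support) and that zdata has exactly two
columns, key first. The names groupValues, "key", "value" and "values"
are made up for illustration and are not from the original code:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.collect_list

// Hypothetical helper: groups the second column by the first one.
// Assumes the input has exactly two columns (key, value).
def groupValues(zdata: DataFrame): DataFrame =
  zdata
    .toDF("key", "value")                      // rename by position
    .groupBy("key")                            // one row per key
    .agg(collect_list("value").as("values"))   // values: array column
```

The result keeps its schema, so the element type of "values" is whatever
the value column's type was, rather than Any.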

P.S. I'd even go for a Dataset.
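
Roughly like this, as a sketch only. It assumes the Spark 2.x Dataset API
(groupByKey/mapGroups) and that both selected columns are strings; the
helper name groupedValues is hypothetical:

```scala
import org.apache.spark.sql.{DataFrame, Dataset}

// Hypothetical helper: typed grouping of (key, value) string pairs.
// Assumes zdata has two string columns; tuples are mapped by ordinal.
def groupedValues(zdata: DataFrame): Dataset[Array[String]] = {
  import zdata.sparkSession.implicits._        // encoders for tuples and arrays
  zdata
    .as[(String, String)]                      // Dataset[(String, String)]
    .groupByKey(_._1)                          // group on the key column
    .mapGroups { (key: String, rows: Iterator[(String, String)]) =>
      rows.map(_._2).toArray                   // keep only the values
    }
}
```

If an RDD[Array[String]] is really needed downstream, calling .rdd on the
result should give one, with the types already in place.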

Jacek
On 18.03.2016 at 5:20 PM, "Bauer, Robert" <Robert.Bauer@asurion.com> wrote:

> I have data that I pull in using a SQL context and then convert to an
> RDD.
>
> The problem is that the type of the resulting RDD is
> RDD[(Any, Iterable[Any])],
>
> and I need the type RDD[Array[String]], i.e. the Iterable converted
> to an Array.
>
> Here’s more detail:
>
> val zdata = sqlContext.read.parquet("s3://.. parquet")
>   .select('Pk, explode('Pg) as "P")
>   .select($"Pk", $"P.A.n")
>
> val r1data = zdata.rdd
>
> val r2data = r1data.map(t => (t(0), t(1))).groupByKey()
>
> At this point r2data’s type is RDD[(Any, Iterable[Any])].
>
> robert
