spark-user mailing list archives

From ayan guha <guha.a...@gmail.com>
Subject Re: Replicating a row n times
Date Fri, 29 Sep 2017 02:21:47 GMT
How about using row_number() for the primary key?

SELECT row_number() OVER (), * FROM table
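A Spark-free sketch of what that window function yields: each row is paired with a sequential 1-based index that can serve as a surrogate primary key. (Class and method names here are illustrative, not part of any API; note that in Spark, row_number() needs an ordered window, and a window with no PARTITION BY pulls all rows into a single partition, which is fine only for small data.)

```java
import java.util.*;
import java.util.stream.*;

public class RowNumberSketch {
    // Plain-Java illustration of row_number(): pair each row with a
    // sequential 1-based index, usable as a surrogate primary key.
    static List<String> withRowNumbers(List<String> rows) {
        return IntStream.range(0, rows.size())
                .mapToObj(i -> (i + 1) + ":" + rows.get(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Three identical rows become three distinct keyed rows.
        System.out.println(withRowNumbers(List.of("a", "a", "a")));
        // → [1:a, 2:a, 3:a]
    }
}
```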

On Fri, 29 Sep 2017 at 10:21 am, Kanagha Kumar <kprasad@salesforce.com>
wrote:

> Hi,
>
> I'm trying to replicate a single row from a dataset n times and create a
> new dataset from it. But while replicating, I need one column's value to
> change for each copy, since it will end up as the primary key when the
> data is finally stored.
>
> Looked at the following reference:
> https://stackoverflow.com/questions/40397740/replicate-spark-row-n-times
>
> import org.apache.spark.sql.functions._
> val result = singleRowDF
>   .withColumn("dummy", explode(array((1 until 100).map(lit): _*)))
>   .selectExpr(singleRowDF.columns: _*)
>
> How can I create a column from an array of values in Java and pass it to
> the explode function? Any suggestions are appreciated.
>
>
> Thanks
> Kanagha
>
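For the Java side of the question: Spark's Java API exposes the same building blocks as the Scala snippet above — functions.lit(...), functions.array(Column...), and functions.explode(Column) — so one way (an assumption about your schema, not tested here) is to build a Column[] with something like IntStream.range(1, 100).mapToObj(functions::lit).toArray(Column[]::new) and pass it to array(...) inside explode(...). Since that needs a running Spark session, here is a Spark-free Java sketch of the underlying transformation — replicate one row n times, giving each copy a distinct key — with hypothetical names throughout:

```java
import java.util.*;
import java.util.stream.*;

public class ReplicateRow {
    // Replicate one row n times, assigning each copy a distinct "pk" value —
    // the same effect explode(array(lit(0) ... lit(n-1))) has in Spark,
    // where the exploded value becomes the new primary-key column.
    static List<Map<String, Object>> replicate(Map<String, Object> row, int n) {
        return IntStream.range(0, n)
                .mapToObj(i -> {
                    Map<String, Object> copy = new HashMap<>(row);
                    copy.put("pk", i); // distinct key per replica
                    return copy;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // One source row fanned out into three keyed copies.
        replicate(Map.of("name", "widget"), 3).forEach(System.out::println);
    }
}
```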
-- 
Best Regards,
Ayan Guha
