spark-user mailing list archives

From Debabrata Ghosh <mailford...@gmail.com>
Subject Re: Efficient way to compare the current row with previous row contents
Date Mon, 12 Feb 2018 14:12:47 GMT
Georg, thanks! Would you be able to help me with a few examples, please?

Thanks again in advance!

Cheers,
D

On Mon, Feb 12, 2018 at 6:03 PM, Georg Heiler <georg.kf.heiler@gmail.com>
wrote:

> You should look into window functions for Spark SQL.
> Debabrata Ghosh <mailfordebu@gmail.com> wrote on Mon, 12 Feb 2018 at
> 13:10:
>
>> Hi,
>>
>> Greetings!
>>
>> I need an efficient way in PySpark to compare the current row with the
>> previous row across all attributes. My intent is to make the best use of
>> Spark's distributed framework so that the comparison runs quickly. Can
>> anyone suggest a suitable algorithm or command? Here is a snapshot of the
>> underlying data that I need to compare:
>>
>> [image: Inline image 1]
>>
>> Thanks in advance !
>>
>> D
>>
>
