spark-user mailing list archives

From KhajaAsmath Mohammed <mdkhajaasm...@gmail.com>
Subject Re: Efficient way to compare the current row with previous row contents
Date Mon, 12 Feb 2018 14:16:34 GMT
I am also looking for the same answer. Will this work in a streaming application too?


> On Feb 12, 2018, at 8:12 AM, Debabrata Ghosh <mailfordebu@gmail.com> wrote:
> 
> Georg, thanks! Would you be able to help me with a few examples, please?
> 
> Thanks in advance again!
> 
> Cheers,
> D
> 
>> On Mon, Feb 12, 2018 at 6:03 PM, Georg Heiler <georg.kf.heiler@gmail.com> wrote:
>> You should look into window functions for Spark SQL.
>> Debabrata Ghosh <mailfordebu@gmail.com> wrote on Mon, Feb 12, 2018 at 13:10:
>>> Hi,
>>> Greetings!
>>> 
>>> I need an efficient way in PySpark to compare the current row with the previous row (on all attributes). My intent is to leverage Spark's distributed framework as far as possible to achieve good speed. Can anyone suggest a suitable algorithm or command? Here is a snapshot of the underlying data that I need to compare:
>>> 
>>> 
>>> 
>>> Thanks in advance!
>>> 
>>> D
> 
