spark-user mailing list archives

From "Ganelin, Ilya" <>
Subject RE: spark challenge: zip with next???
Date Thu, 29 Jan 2015 21:32:23 GMT
Make a copy of your RDD with an extra entry at the beginning to offset it by one. Then you
can zip the two RDDs and run a map to generate an RDD of differences.
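
A minimal sketch of this idea in PySpark, under a few assumptions: the RDD holds
plain datetime objects, and the names is_gap, find_gaps and main are hypothetical
helpers made up for illustration. One caveat: RDD.zip expects both RDDs to have the
same number of partitions and the same number of elements in each partition, which is
hard to guarantee after prepending an entry. The sketch therefore swaps the literal zip
for zipWithIndex plus a join on the index, which pairs each date with the next one in
the same way.

```python
from datetime import datetime, timedelta


def is_gap(previous, current, min_gap):
    """True if the two timestamps are more than min_gap apart."""
    return current - previous > min_gap


def find_gaps(dates_rdd, min_gap):
    """Return an RDD of (previous, current) pairs separated by more than min_gap.

    dates_rdd is assumed to be a pyspark RDD of datetime objects.
    """
    # Sort first so that consecutive indices mean consecutive dates.
    indexed = dates_rdd.sortBy(lambda d: d).zipWithIndex()  # (date, i)
    current = indexed.map(lambda p: (p[1], p[0]))           # (i, date_i)
    # The "offset copy": key each date by the index of its predecessor.
    following = indexed.map(lambda p: (p[1] - 1, p[0]))     # (i - 1, date_i)
    # Joining on the index yields (i, (date_i, date_i_plus_1)).
    pairs = current.join(following)
    return (pairs
            .filter(lambda kv: is_gap(kv[1][0], kv[1][1], min_gap))
            .sortByKey()
            .values())


def main():
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "gap-demo")
    dates = sc.parallelize([
        datetime(2015, 1, 1),
        datetime(2015, 1, 2),
        datetime(2015, 1, 10),
        datetime(2015, 1, 11),
    ])
    for previous, current in find_gaps(dates, timedelta(days=3)).collect():
        print(previous, "->", current)
    sc.stop()
```

Calling main() in an environment with PySpark installed should report the single gap
between January 2 and January 10. The join causes a shuffle, so this is likely not the
cheapest option for very large data sets; it is only meant to show the pairing step.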

-----Original Message-----
From: derrickburns [<>]
Sent: Thursday, January 29, 2015 02:52 PM Eastern Standard Time
Subject: spark challenge: zip with next???

Here is a spark challenge for you!

I have a data set where each entry has a date.  I would like to identify
gaps in the dates larger than a given length.  For example, if the data
were log entries, then the gaps would tell me when I was missing log data
for long periods of time. What is the most efficient way to achieve this in


