spark-user mailing list archives

From John Omernik <>
Subject Spark Streaming for Each RDD - Exception on Empty
Date Fri, 05 Jun 2015 15:08:13 GMT
Is there a Pythonic/Sparkonic way to test for an empty RDD before using
foreachRDD? Basically, I am using the Python example to "put records
somewhere". When I have data, it works fine; when I don't, I get an
exception. I am not sure about the performance implications of just
throwing an exception every time there is no data, but can I test for
data before sending it?

I did see one post that mentioned using take(1) on the stream to test for
data, but I am not sure where to put that in this example. Does it go in the
lambda function, or somewhere else? Looking for pointers!

mydstream.foreachRDD(lambda rdd: rdd.foreachPartition(parseRDD))

It is based on this example code from the link above:

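def sendPartition(iter):
    # Open one connection per partition rather than one per record
    connection = createNewConnection()
    for record in iter:
        connection.send(record)
    connection.close()

dstream.foreachRDD(lambda rdd: rdd.foreachPartition(sendPartition))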
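
For the take(1) idea, one possible placement is inside the function passed to foreachRDD, before foreachPartition is called. The following is only a minimal sketch: send_if_nonempty is a hypothetical helper name, and it assumes nothing about the RDD beyond its take() and foreachPartition() methods.

```python
def send_if_nonempty(rdd, send_partition):
    """Run send_partition over each partition only if the RDD has data.

    `send_if_nonempty` is a hypothetical helper, not part of Spark.
    Returns True if the partitions were processed, False if skipped.
    """
    # take(1) returns a list with at most one element; an empty list
    # is falsy, so an empty batch skips the output step entirely.
    if rdd.take(1):
        rdd.foreachPartition(send_partition)
        return True
    return False

# Wiring it into the stream, assuming `mydstream` and `parseRDD`
# are defined as in the message above:
# mydstream.foreachRDD(lambda rdd: send_if_nonempty(rdd, parseRDD))
```

The check runs on the driver once per batch, so it costs a small extra job for each interval. Depending on the Spark version, rdd.isEmpty() may also be available and would express the same test more directly.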
