spark-issues mailing list archives

From "Patrick Wendell (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-1579) PySpark should distinguish expected IOExceptions from unexpected ones in the worker
Date Wed, 23 Apr 2014 01:51:14 GMT
Patrick Wendell created SPARK-1579:
--------------------------------------

             Summary: PySpark should distinguish expected IOExceptions from unexpected ones
in the worker
                 Key: SPARK-1579
                 URL: https://issues.apache.org/jira/browse/SPARK-1579
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
            Reporter: Patrick Wendell
            Assignee: Aaron Davidson
             Fix For: 1.1.0


I chatted with [~adav] a bit about this. Right now we drop IOExceptions because they are (in
some cases) expected if a Python worker returns before consuming its entire input. The issue
is that this also swallows legitimate IOExceptions when they occur.

One thought we had was to change the daemon.py file so that, instead of closing the socket
when the function is over, it simply busy-waits on the socket being closed. We'd transfer the
responsibility for closing the socket to the Java reader. The Java reader could, when it has
finished consuming output from Python, set a volatile flag to indicate that Python has fully
returned. Then if an IOException is caught, we only swallow it if we are expecting it.
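A minimal sketch of the expected-vs-unexpected distinction, in Python for illustration (the
class and method names here are hypothetical, not actual Spark code; in Spark the flag would
live on the Java side as a volatile variable):

```python
import threading

class WorkerIOGuard:
    """Hypothetical sketch: swallow an IOError only when the reader has
    already signalled that the worker's output was fully consumed."""

    def __init__(self):
        # Stands in for the Java reader's volatile flag; the reader sets it
        # once it has finished consuming all output from the Python worker.
        self.fully_consumed = threading.Event()

    def handle_io_error(self, err: OSError) -> None:
        if self.fully_consumed.is_set():
            # Expected: the reader closed the socket after draining the
            # worker's output, so this error carries no information.
            return
        # Unexpected: a legitimate I/O failure that must propagate.
        raise err
```

The point of the flag is ordering: it is set strictly before the reader closes the socket, so
any IOException seen while the flag is unset cannot be the benign close-after-consume case.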

This would also let us remove the current warning message.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
