spark-dev mailing list archives

From Cyanny LIANG <lgrcya...@gmail.com>
Subject Fwd: Does pyspark worker really use pipe?
Date Thu, 20 Jul 2017 02:17:50 GMT
Hello,
According to the PySpark Internals wiki, the PySpark worker communicates
over a pipe, not a socket:
https://cwiki.apache.org/confluence/display/SPARK/PySpark+Internals

I have checked the pyspark/worker.py code:

if __name__ == '__main__':
    # Read a local port to connect to from stdin
    java_port = int(sys.stdin.readline())
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect(("127.0.0.1", java_port))
    sock_file = sock.makefile("rwb", 65536)
    main(sock_file, sock_file)

So the worker only reads a port number from stdin and then does the actual
communication over a socket, not a pipe. Is there anything I missed?
Why does the PySpark worker use a socket instead of a pipe? Is it for
performance reasons?
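
To make the question concrete, here is a minimal sketch of the handshake that the quoted block seems to imply: the launching process passes a port number through the child's stdin, and all further data goes over a localhost TCP socket. This is not Spark's code. The parent side, the names CHILD_SOURCE and run_handshake, and the "echo:" reply are hypothetical and only for illustration; I have not checked how the JVM side actually does it.

```python
import socket
import subprocess
import sys

# Hypothetical child: mirrors the quoted __main__ block, then echoes one line.
CHILD_SOURCE = r'''
import socket, sys
port = int(sys.stdin.readline())          # port arrives over stdin (a pipe)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("127.0.0.1", port))         # data channel is a TCP socket
sock_file = sock.makefile("rwb", 65536)
line = sock_file.readline()
sock_file.write(b"echo:" + line)
sock_file.flush()
sock_file.close()
sock.close()
'''


def run_handshake(payload=b"hello\n"):
    """Start the child, send it our port via stdin, exchange one line by socket."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(10)                 # avoid hanging if the child never connects
    server.bind(("127.0.0.1", 0))         # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    try:
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE], stdin=subprocess.PIPE
        )
        # The pipe carries only the port number.
        child.stdin.write(("%d\n" % port).encode("ascii"))
        child.stdin.flush()
        child.stdin.close()

        conn, _ = server.accept()
        with conn, conn.makefile("rwb", 65536) as sock_file:
            sock_file.write(payload)
            sock_file.flush()
            reply = sock_file.readline()
        child.wait()
        return reply
    finally:
        server.close()
```

In this sketch, run_handshake(b"ping\n") returns b"echo:ping\n". The pipe is used once, for the port, and everything after that travels over the socket, which matches what I see in worker.py.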

-- 
Best & Regards
Cyanny LIANG
email: lgrcyanny@gmail.com
