spark-dev mailing list archives

From Jungtaek Lim <>
Subject Re: TextSocketMicroBatchReader no longer supports nc utility
Date Tue, 05 Jun 2018 22:42:58 GMT
FYI: I filed an issue and provided the patch.

On Tue, Jun 5, 2018 at 11:30 AM, Jungtaek Lim <> wrote:

> Yeah, that's why I started this thread. The socket source is mostly used in
> the examples in the official documentation and in quick experiments, where
> we tend to simply use netcat.
> I'll file an issue and provide the fix.
> On Tue, Jun 5, 2018 at 1:48 AM, Joseph Torres <> wrote:
>> I tend to agree that this is a bug. It's kinda silly that nc does this,
>> but a socket connector that doesn't work with netcat will surely seem
>> broken to users. It wouldn't be a huge change to defer opening the socket
>> until a read is actually required.
>> On Sun, Jun 3, 2018 at 9:55 PM, Jungtaek Lim <> wrote:
>>> Hi devs,
>>> I'm not sure I'll hear back soon since Spark Summit is just around the
>>> corner, but I wanted to post this and wait.
>>> While playing with Spark 2.4.0-SNAPSHOT, I found that the nc command exits
>>> before any actual data is read, so the query also exits with an error.
>>> The cause is that Spark launches a temporary reader to read the schema,
>>> closes it, and then opens the reader again. A reliable socket server should
>>> handle this without any issue, but nc normally can't handle multiple
>>> connections and simply exits when the temporary reader is closed.
>>> I would like to file an issue and contribute a fix if we think this is a
>>> bug (otherwise we need to replace the nc utility with another one, maybe
>>> our own implementation?), but I'm not sure we are happy to apply a
>>> workaround for a specific source.
>>> I would like to hear opinions before giving it a shot.
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
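
The behavior discussed in this thread can be reproduced outside Spark. The sketch below is not Spark code and not the patch mentioned above. It is a minimal Python illustration under stated assumptions: `one_shot_server` is a hypothetical stand-in for a netcat-style listener that serves a single connection and then stops listening, `eager_flow` mimics opening and closing a temporary connection before the real one, and `LazySocketReader` is a hypothetical reader that defers opening the socket until a read is actually required, along the lines suggested in the thread. All names, and the single-column schema, are illustrative.

```python
import socket
import threading


def one_shot_server(ready, port_box, payload):
    """Serve exactly one connection, then stop listening (netcat-like stand-in)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        srv.listen(1)
        port_box.append(srv.getsockname()[1])
        ready.set()
        conn, _ = srv.accept()
        with conn:
            try:
                conn.sendall(payload)
            except OSError:
                pass  # the client may already have disconnected
    # Leaving the `with` block closes the listening socket.


def start_server(payload):
    """Start the one-shot server in a thread and return (thread, port)."""
    ready, port_box = threading.Event(), []
    thread = threading.Thread(target=one_shot_server, args=(ready, port_box, payload))
    thread.start()
    ready.wait()
    return thread, port_box[0]


class LazySocketReader:
    """Hypothetical reader that opens its socket only on the first read."""

    def __init__(self, host, port):
        self.host, self.port = host, port
        self._sock = None

    def schema(self):
        # Assumption: the schema is static, so no connection is needed for it.
        return ["value"]

    def read_all(self):
        if self._sock is None:  # connect on first use only
            self._sock = socket.create_connection((self.host, self.port))
        chunks = []
        while True:
            chunk = self._sock.recv(4096)
            if not chunk:  # empty read: the server closed the connection
                break
            chunks.append(chunk)
        return b"".join(chunks)

    def close(self):
        if self._sock is not None:
            self._sock.close()
            self._sock = None


def eager_flow(payload=b"hello\n"):
    """Open and close a temporary connection first, then try to reconnect."""
    thread, port = start_server(payload)
    socket.create_connection(("127.0.0.1", port)).close()  # temporary reader
    thread.join()  # the one-shot server has stopped listening by now
    try:
        socket.create_connection(("127.0.0.1", port)).close()
        return "connected"
    except ConnectionRefusedError:
        return "refused"


def lazy_flow(payload=b"hello\n"):
    """Ask for the schema without connecting, then read over one connection."""
    thread, port = start_server(payload)
    reader = LazySocketReader("127.0.0.1", port)
    reader.schema()           # no socket is opened here
    data = reader.read_all()  # first and only connection
    reader.close()
    thread.join()
    return data
```

On a typical loopback setup, `eager_flow()` should report that the second connection is refused, because the single-connection server is gone once the temporary connection closes, while `lazy_flow()` should return the payload since only one connection is ever made.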
