spark-user mailing list archives

From: Koert Kuipers <ko...@tresata.com>
Subject: spark structured streaming with file based sources and sinks
Date: Mon, 06 Aug 2018 16:31:38 GMT

has anyone used spark structured streaming from/to files (json, csv,
parquet, avro) in a non-test setting?

i realize kafka is probably the way to go, but let's say i have a situation
where kafka is not available for reasons out of my control, and i want to
do micro-batching. could i use files to do so in a production setting?
basically:

files on hdfs => spark structured streaming => files on hdfs => spark
structured streaming => files on hdfs => etc.

i assumed this is not a good idea, but i'm interested to hear otherwise.
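
For concreteness, the chain described above could be sketched roughly as follows in PySpark. This is only an illustration under assumptions, not a tested production setup: the base path, the two-column schema, the one-minute trigger, and the helper names (StagePaths, stage_paths, start_stage, main) are all made up for this example. It relies on the file source needing an explicit schema and the file sink supporting append mode only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagePaths:
    source: str      # directory this stage reads from
    sink: str        # directory this stage writes to
    checkpoint: str  # per-query checkpoint directory


def stage_paths(base, stage):
    """Hypothetical layout: stage N reads the output directory of stage N-1."""
    return StagePaths(
        source=f"{base}/stage{stage - 1}",
        sink=f"{base}/stage{stage}",
        checkpoint=f"{base}/checkpoints/stage{stage}",
    )


def start_stage(spark, schema, paths, source_format="parquet", transform=lambda df: df):
    """Start one file-to-file streaming query and return its StreamingQuery."""
    # Streaming file sources need an explicit schema unless schema
    # inference has been switched on in the Spark configuration.
    stream = spark.readStream.schema(schema).format(source_format).load(paths.source)
    return (
        transform(stream).writeStream
        .format("parquet")
        .outputMode("append")  # the file sink only supports append mode
        .option("path", paths.sink)
        .option("checkpointLocation", paths.checkpoint)
        .trigger(processingTime="1 minute")  # assumed micro-batch interval
        .start()
    )


def main():
    # pyspark is imported here so the helpers above can be used without it.
    from pyspark.sql import SparkSession
    from pyspark.sql.types import LongType, StringType, StructField, StructType

    spark = SparkSession.builder.appName("file-to-file-chain").getOrCreate()
    # Assumed schema, purely for illustration.
    schema = StructType([
        StructField("id", LongType()),
        StructField("payload", StringType()),
    ])
    base = "hdfs:///data/pipeline"  # assumed location

    # Stage 1: json files in stage0 -> parquet files in stage1.
    start_stage(spark, schema, stage_paths(base, 1), source_format="json")
    # Stage 2: parquet files in stage1 -> parquet files in stage2.
    start_stage(spark, schema, stage_paths(base, 2))
    spark.streams.awaitAnyTermination()
```

Calling main() from a script submitted with spark-submit would start both stages in one application; running each stage as its own application is an equally plausible choice. Each query gets its own checkpoint directory, and new input files should be moved into the source directory atomically so that a half-written file is never picked up.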
