spark-user mailing list archives

From Ashwin Raju <ther...@gmail.com>
Subject Spark 2.2 streaming with append mode: empty output
Date Mon, 14 Aug 2017 23:09:03 GMT
Hi,

I am running Spark 2.2 and trying out structured streaming. I have the
following code:

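from pyspark.sql import functions as F

# Sum bytes per 1-day event-time window and per grouping column,
# with a 1-minute watermark on the event-time column.
df = frame \
    .withWatermark("timestamp", "1 minute") \
    .groupby(F.window("timestamp", "1 day"), *groupby_cols) \
    .agg(F.sum("bytes"))

# Write the aggregated stream to the console.
query = df.writeStream \
    .format("console") \
    .option("checkpointLocation", '\some\chkpoint') \
    .outputMode("complete") \
    .start()

query.awaitTermination()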



It prints out a bunch of aggregated rows to console. When I run the same
query with outputMode("append") however, the output only has the column
names, no rows. I was originally trying to output to parquet, which only
supports append mode. I was seeing no data in my parquet files, so I
switched to console output to debug, then noticed this issue. Am I
misunderstanding something about how append mode works?
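
One possible explanation, sketched in plain Python rather than Spark: as far as I understand the documented behavior, append mode emits an aggregated window only once the watermark (the maximum event time seen so far minus the watermark delay) has reached the end of that window. With a 1-day window, that would mean nothing is printed until events from the following day arrive. The helper names below are hypothetical, and the midnight alignment of naive datetimes is a simplifying assumption, not Spark's actual implementation.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=1)               # mirrors F.window("timestamp", "1 day")
WATERMARK_DELAY = timedelta(minutes=1)   # mirrors withWatermark("timestamp", "1 minute")


def window_end(event_time: datetime) -> datetime:
    """End of the 1-day tumbling window containing event_time.

    Assumes windows are aligned to midnight of the (naive) timestamp's day.
    """
    start = datetime(event_time.year, event_time.month, event_time.day)
    return start + WINDOW


def is_emitted_in_append_mode(event_time: datetime,
                              max_event_time_seen: datetime) -> bool:
    """True once the watermark has reached the end of the event's window."""
    watermark = max_event_time_seen - WATERMARK_DELAY
    return watermark >= window_end(event_time)


if __name__ == "__main__":
    event = datetime(2017, 8, 14, 23, 0)
    # Still inside the same day: the window is not final yet.
    print(is_emitted_in_append_mode(event, datetime(2017, 8, 14, 23, 59)))
    # Data from one minute into the next day pushes the watermark to the window end.
    print(is_emitted_in_append_mode(event, datetime(2017, 8, 15, 0, 1)))
```

If this model is right, a much shorter window (or input that spans more than one day) should make rows show up in append mode.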


Thanks,

Ashwin
