spark-user mailing list archives

From Ashwin Raju
Subject Spark 2.2 streaming with append mode: empty output
Date Mon, 14 Aug 2017 23:09:03 GMT

I am running Spark 2.2 and trying out structured streaming. I have the
following code:

from pyspark.sql import functions as F

df=frame \
    .withWatermark("timestamp","1 minute") \
    .groupby(F.window("timestamp","1 day"),*groupby_cols) \

query = frame.writeStream \
    .option("checkpointLocation", '\some\chkpoint')

It prints out a bunch of aggregated rows to console. When I run the same
query with outputMode("append"), however, the output has only the column
names and no rows. I was originally trying to output to parquet, which only
supports append mode. I was seeing no data in my parquet files, so I
switched to console output to debug, then noticed this issue. Am I
misunderstanding something about how append mode works?
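
For reference, here is a rough way to picture append mode with a watermarked window aggregation: a window's row is written only once, after the watermark (the maximum event time seen, minus the delay) has passed the end of that window. The sketch below is plain Python, not Spark code. The function names, the per-window count, and the sample timestamps are all made up for illustration, and it assumes the watermark is updated and applied at the end of each batch, which is simpler than what Spark's triggers actually do.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=1)               # mirrors F.window("timestamp", "1 day")
WATERMARK_DELAY = timedelta(minutes=1)   # mirrors withWatermark("timestamp", "1 minute")
EPOCH = datetime(1970, 1, 1)


def window_start(ts):
    """Start of the tumbling 1-day window (aligned to the epoch) that contains ts."""
    return ts - (ts - EPOCH) % WINDOW


def simulate_append(batches):
    """Return, for each batch, the (window_start, count) rows append mode would emit.

    Simplifying assumption: the watermark is recomputed and applied at the
    end of every batch.
    """
    open_counts = {}       # window start -> running count for windows not yet emitted
    max_event_time = None
    watermark = None
    emitted_per_batch = []
    for batch in batches:
        for ts in batch:
            start = window_start(ts)
            # Drop events whose window already ended at or before the watermark.
            if watermark is not None and start + WINDOW <= watermark:
                continue
            open_counts[start] = open_counts.get(start, 0) + 1
            if max_event_time is None or ts > max_event_time:
                max_event_time = ts
        if max_event_time is not None:
            watermark = max_event_time - WATERMARK_DELAY
        # Emit (and forget) only the windows that the watermark has passed.
        closed = sorted(
            s for s in open_counts
            if watermark is not None and s + WINDOW <= watermark
        )
        emitted_per_batch.append([(s, open_counts.pop(s)) for s in closed])
    return emitted_per_batch


# Hypothetical event times: three events on 14 Aug, then two just after midnight.
batches = [
    [datetime(2017, 8, 14, 10, 0), datetime(2017, 8, 14, 23, 0)],
    [datetime(2017, 8, 14, 23, 30)],
    [datetime(2017, 8, 15, 0, 0, 30)],
    [datetime(2017, 8, 15, 0, 2)],
]
result = simulate_append(batches)
for i, rows in enumerate(result, start=1):
    print("batch", i, rows)
```

In this sketch the first three batches emit nothing. The 14 Aug window appears only in the fourth batch, once an event later than 00:01 on 15 Aug has arrived. If the real query behaves similarly, a 1-day window would show no rows in append mode until the data's event time has moved past the end of the day plus the 1 minute delay.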


