spark-user mailing list archives

From WangXiaolong <>
Subject Structured Streaming doesn't write checkpoint log when I use coalesce
Date Thu, 09 Aug 2018 11:38:03 GMT

   Lately I ran into a problem while writing a Structured Streaming job that writes
data into OpenTSDB.
  The write-stream part looks something like this:

          .trigger(Trigger.ProcessingTime(s"$triggerSeconds seconds"))
          .foreach {
              MongoProp(mongoUrl, mongoPort, mongoUser, mongoPassword, mongoDatabase, mongoCollection,mongoAuthenticationDatabase)

    When I check the checkpoint dir, I find that the "/checkpoint/state" dir is empty.
I looked into the executor's log and found that the HDFSBackedStateStoreProvider didn't write
anything to the checkpoint dir.

   The strange thing is that when I replace the "coalesce" call with "repartition", the
problem goes away. Is there a difference between these two functions when using Structured Streaming?

  Looking forward to your help, thanks.
