spark-user mailing list archives

From M Singh <mans2si...@yahoo.com.INVALID>
Subject Apache Spark - Structured Streaming StreamExecution Stats Description
Date Wed, 28 Mar 2018 17:10:38 GMT
Hi:
I am using Spark Structured Streaming 2.2.1 with flatMapGroupsWithState and a groupBy
count operator.
In the StreamExecution logs I see two entries for stateOperators:
"stateOperators" : [ {
    "numRowsTotal" : 1617339,
    "numRowsUpdated" : 9647
  }, {
    "numRowsTotal" : 1326355,
    "numRowsUpdated" : 1398672
  } ],
My questions are:
1. Is there a way to figure out which stats entry is for flatMapGroupsWithState and which one
is for the groupBy count? In my case I can guess based on my data, but I want to be definitive
about it.
2. For the second stats entry, how can numRowsTotal (1326355) be less than numRowsUpdated
(1398672)?
If there is documentation I can use to understand the debug output, please let me know.
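
To make the second question concrete, here is a minimal Python sketch of the comparison I am doing by hand. It uses only the values from the log excerpt above; the outer braces are added so the fragment parses as JSON, and the function name summarize_state_operators is my own, not a Spark API. It only lists the entries in log order and does not tell me which operator each entry belongs to.

```python
import json

# The stateOperators fragment from the log, wrapped in braces so that it
# parses as a standalone JSON object. The values are copied from the log.
progress_fragment = """
{
  "stateOperators" : [ {
    "numRowsTotal" : 1617339,
    "numRowsUpdated" : 9647
  }, {
    "numRowsTotal" : 1326355,
    "numRowsUpdated" : 1398672
  } ]
}
"""


def summarize_state_operators(fragment):
    """Return one summary dict per stateOperators entry, in log order."""
    operators = json.loads(fragment)["stateOperators"]
    summaries = []
    for index, op in enumerate(operators):
        total = op["numRowsTotal"]
        updated = op["numRowsUpdated"]
        summaries.append({
            "index": index,
            "numRowsTotal": total,
            "numRowsUpdated": updated,
            # True when more rows were updated than are held in state.
            "updated_exceeds_total": updated > total,
            "difference": updated - total,
        })
    return summaries


for summary in summarize_state_operators(progress_fragment):
    print(summary)
```

Only the second entry has updated_exceeds_total set, which is the case I am asking about.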

Thanks
