spark-user mailing list archives

From Reminia Scarlet <reminia.scar...@gmail.com>
Subject Re: Spark Streaming logical plan leaf nodes do not equal physical plan leaf nodes and streaming metrics cannot be reported.
Date Wed, 23 Oct 2019 14:21:03 GMT
@Jungtaek
I'm using Spark 2.4 (HDI 4.0) on Azure.
Maybe there are other corner cases that were not taken into consideration.
I will also decompile the Spark jar from Azure to check the source code.
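
To make the symptom in the quoted message below easier to reason about, here is a rough sketch of the leaf-matching behaviour as I understand it. It is a simplified pure-Python model, not Spark code: `PlanNode`, `collect_leaves` and `input_rows_by_source` are hypothetical names made up for illustration, and the plan shapes are invented. The assumption is that per-source row counts are attributed by pairing logical plan leaves with execution plan leaves in order, and that no counts are reported at all when the two leaf lists differ in length.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PlanNode:
    """Toy plan node (hypothetical); a node without children is a leaf."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)
    source: Optional[str] = None  # set on logical leaves that read a streaming source
    num_output_rows: int = 0      # stand-in for a physical leaf's row-count metric


def collect_leaves(node: PlanNode) -> List[PlanNode]:
    """Return the leaves of the tree rooted at `node`, left to right."""
    if not node.children:
        return [node]
    leaves: List[PlanNode] = []
    for child in node.children:
        leaves.extend(collect_leaves(child))
    return leaves


def input_rows_by_source(logical: PlanNode, physical: PlanNode) -> Dict[str, int]:
    """Attribute physical leaf row counts to sources by pairing leaves in order."""
    logical_leaves = collect_leaves(logical)
    physical_leaves = collect_leaves(physical)
    if len(logical_leaves) != len(physical_leaves):
        # Assumed behaviour: on a mismatch nothing is attributed,
        # so every source appears to have read 0 rows.
        return {}
    counts: Dict[str, int] = {}
    for l_leaf, p_leaf in zip(logical_leaves, physical_leaves):
        if l_leaf.source is not None:
            counts[l_leaf.source] = counts.get(l_leaf.source, 0) + p_leaf.num_output_rows
    return counts


# Physical plan with two leaves: a streaming scan and a static scan.
physical_plan = PlanNode("Join", [
    PlanNode("StreamScan", num_output_rows=100),
    PlanNode("StaticScan", num_output_rows=5),
])

# Logical plan whose leaves line up one-to-one with the physical leaves.
logical_ok = PlanNode("Join", [
    PlanNode("StreamingRelation", source="kafka"),
    PlanNode("Relation"),
])

# Logical plan with one extra leaf, as if the static side were counted twice.
logical_extra = PlanNode("Join", [
    PlanNode("StreamingRelation", source="kafka"),
    PlanNode("Wrapper", [PlanNode("Relation"), PlanNode("LogicalRDD")]),
])

matched = input_rows_by_source(logical_ok, physical_plan)
mismatched = input_rows_by_source(logical_extra, physical_plan)
```

In this model a single extra logical leaf is enough to drop all per-source counts, which would match a total of 0 input rows even though the physical scan produced rows.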

On Wed, Oct 23, 2019 at 9:39 PM Jungtaek Lim <kabhwan.opensource@gmail.com>
wrote:

> Which version of Spark are you using?
> I guess there was a relevant issue, SPARK-24050 [1], which was fixed in Spark
> 2.4.0, so you may want to try the latest version if you are using a
> lower version.
>
> - Jungtaek Lim (HeartSaVioR)
>
> 1. https://issues.apache.org/jira/browse/SPARK-24050
>
> On Wed, Oct 23, 2019 at 9:57 PM Reminia Scarlet <reminia.scarlet@gmail.com>
> wrote:
>
>> Hi all:
>> I use StreamingQueryListener to report batch inputRecordsNum as metrics,
>> but numInputRows is always 0. The log output from
>> MicroBatchExecution.scala says:
>>
>>  2019-10-23 06:56:05 WARN  MicroBatchExecution:66 - Could not report metrics as number leaves in trigger logical plan did not match that of the execution plan:
>>
>> This causes the number of input rows per source to always be 0. The code below, from ProgressReporter.scala,
>> takes this path when the number of leaves in the logical plan does not match the number in the execution plan.
>>
>> [image: image.png]
>> Attached are the leaves of the logical plan and of the physical plan. I think there might
>> be a bug: LogicalRDD seems to be duplicated as Relation in the logical plan
>> and is counted twice as a leaf. If we remove the LogicalRDD, the leaf counts should be the
>> same.
>>
>> [image: image.png]
>> [image: image.png]
>>
>> Can anyone help? Thanks very much.
>>
>>
