spark-issues mailing list archives

From "Gabor Somogyi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-27977) MicroBatchWriter should use StreamWriter for human-friendly textual representation (toString)
Date Wed, 03 Jul 2019 15:19:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-27977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877918#comment-16877918 ]

Gabor Somogyi commented on SPARK-27977:
---------------------------------------

Yeah, the latest master looks like:
{code:java}
== Parsed Logical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@3de866af
+- StreamingDataSourceV2Relation [value#1], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@74fd5bf3, MemoryStream[value#1], 1, 2

== Analyzed Logical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@3de866af
+- StreamingDataSourceV2Relation [value#1], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@74fd5bf3, MemoryStream[value#1], 1, 2

== Optimized Logical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@3de866af
+- StreamingDataSourceV2Relation [value#1], org.apache.spark.sql.execution.streaming.MemoryStreamScanBuilder@74fd5bf3, MemoryStream[value#1], 1, 2

== Physical Plan ==
WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWrite@3de866af
+- *(1) Project [value#1]
   +- *(1) MicroBatchScan[value#1] MemoryStreamDataSource
{code}


> MicroBatchWriter should use StreamWriter for human-friendly textual representation (toString)
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27977
>                 URL: https://issues.apache.org/jira/browse/SPARK-27977
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.0.0
>            Reporter: Jacek Laskowski
>            Priority: Minor
>
> The following is the extended explain output for a streaming query:
> {code}
> == Parsed Logical Plan ==
> WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
> +- Project [value#39 AS value#0]
>    +- Streaming RelationV2 socket[value#39] (Options: [host=localhost,port=8888])
> == Analyzed Logical Plan ==
> WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
> +- Project [value#39 AS value#0]
>    +- Streaming RelationV2 socket[value#39] (Options: [host=localhost,port=8888])
> == Optimized Logical Plan ==
> WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
> +- Streaming RelationV2 socket[value#39] (Options: [host=localhost,port=8888])
> == Physical Plan ==
> WriteToDataSourceV2 org.apache.spark.sql.execution.streaming.sources.MicroBatchWriter@4737caef
> +- *(1) Project [value#39]
>    +- *(1) ScanV2 socket[value#39] (Options: [host=localhost,port=8888])
> {code}
> As you may have noticed, {{WriteToDataSourceV2}} is followed by the internal representation of {{MicroBatchWriter}}, which is a mere adapter for a {{StreamWriter}}, e.g. {{ConsoleWriter}}.
> It'd be more debugging-friendly if the plans included whatever {{StreamWriter.toString}} returns (in the case of {{ConsoleWriter}} that would be {{ConsoleWriter[numRows=..., truncate=...]}}, which gives more context).
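
The request in the description above boils down to having the adapter forward toString to the writer it wraps. A minimal sketch of that idea follows, assuming simplified stand-in types: StreamWriterSketch, ConsoleWriterSketch and MicroBatchWriterSketch are hypothetical names invented for illustration and do not reproduce Spark's actual interfaces or any actual patch for this issue.

```java
// Hypothetical, simplified stand-ins for illustration only.
// These are NOT Spark's real StreamWriter / ConsoleWriter / MicroBatchWriter.
interface StreamWriterSketch {
    void commit(long epochId);
}

final class ConsoleWriterSketch implements StreamWriterSketch {
    private final int numRows;
    private final boolean truncate;

    ConsoleWriterSketch(int numRows, boolean truncate) {
        this.numRows = numRows;
        this.truncate = truncate;
    }

    @Override
    public void commit(long epochId) {
        // No-op in this sketch; a real sink would emit the batch here.
    }

    @Override
    public String toString() {
        // Mirrors the shape quoted in the issue: ConsoleWriter[numRows=..., truncate=...]
        return "ConsoleWriter[numRows=" + numRows + ", truncate=" + truncate + "]";
    }
}

// Adapter that binds a batch id to a wrapped writer.
final class MicroBatchWriterSketch {
    private final long batchId;
    private final StreamWriterSketch writer;

    MicroBatchWriterSketch(long batchId, StreamWriterSketch writer) {
        this.batchId = batchId;
        this.writer = writer;
    }

    void commit() {
        writer.commit(batchId);
    }

    // Without this override, Object.toString() yields "ClassName@hexHash",
    // which is the kind of text seen in the plans above.
    @Override
    public String toString() {
        return writer.toString();
    }
}

class ToStringDemo {
    public static void main(String[] args) {
        MicroBatchWriterSketch adapter =
            new MicroBatchWriterSketch(0L, new ConsoleWriterSketch(20, false));
        // Prints: WriteToDataSourceV2 ConsoleWriter[numRows=20, truncate=false]
        System.out.println("WriteToDataSourceV2 " + adapter);
    }
}
```

Delegating rather than building a separate string in the adapter keeps the plan text in sync with whatever the wrapped writer chooses to report about itself.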



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


