spark-issues mailing list archives

From "Gabor Somogyi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-27975) ConsoleSink should display alias and options for streaming progress
Date Wed, 03 Jul 2019 13:31:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-27975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16877836#comment-16877836 ]

Gabor Somogyi commented on SPARK-27975:
---------------------------------------

After I moved the console sink from v1 to v2, the progress looks like this:
{code:java}
{
  "id" : "b0a8ffaa-900b-4a8b-8769-c2f43b9cf99b",
  "runId" : "6b4d01ed-7d0f-44de-8997-0b31bf2a106e",
  "name" : null,
  "timestamp" : "2019-07-03T12:55:17.104Z",
  "batchId" : 0,
  "numInputRows" : 1,
  "processedRowsPerSecond" : 0.5115089514066496,
  "durationMs" : {
    "addBatch" : 1486,
    "getBatch" : 3,
    "latestOffset" : 0,
    "queryPlanning" : 335,
    "triggerExecution" : 1955,
    "walCommit" : 69
  },
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "MemoryStream[value#1]",
    "startOffset" : null,
    "endOffset" : 0,
    "numInputRows" : 1,
    "processedRowsPerSecond" : 0.5115089514066496
  } ],
  "sink" : {
    "description" : "org.apache.spark.sql.execution.streaming.ConsoleTable@373e6cb2",
    "numOutputRows" : 1
  }
}
{code}
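As a small illustration of what the sink entry above contains, the sketch below parses a trimmed copy of that progress JSON and splits the sink description into its class name and instance hash. The variable names are illustrative only and are not part of Spark.

```python
import json

# Trimmed copy of the progress shown above; only the fields used below are kept.
progress_json = """
{
  "batchId" : 0,
  "numInputRows" : 1,
  "sources" : [ {
    "description" : "MemoryStream[value#1]",
    "numInputRows" : 1
  } ],
  "sink" : {
    "description" : "org.apache.spark.sql.execution.streaming.ConsoleTable@373e6cb2",
    "numOutputRows" : 1
  }
}
"""

progress = json.loads(progress_json)
sink = progress["sink"]

# The description is a default object representation: class name, '@', hash.
sink_class, _, instance_hash = sink["description"].partition("@")

# Keep only the last component of the fully qualified class name.
simple_name = sink_class.rsplit(".", 1)[-1]

print(simple_name, instance_hash, sink["numOutputRows"])
```

Nothing in this description says which options the sink was started with, which is the gap this issue is about.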
I think TextSocketV2 is different from ConsoleTable. In TextSocketV2 the mentioned parameters
(host and port) are keys that identify the instance, which is not the case for ConsoleTable.
I agree that at the moment it's not possible to see the parameters of ConsoleWrite, but in this
case I suggest adding a log entry (consider a sink that may have 10+ different parameters,
for example Kafka).
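
To make the two options concrete, here is a minimal sketch in plain Python rather than Spark code. The helpers describe and log_options are hypothetical names invented for this illustration, and the console option values are sample inputs, not output captured from Spark.

```python
import logging

logger = logging.getLogger("sink_description_sketch")


def describe(alias, options):
    """Build a socket-style description, e.g. 'TextSocketV2[host: localhost, port: 8888]'."""
    if not options:
        return alias
    rendered = ", ".join(f"{key}: {value}" for key, value in options.items())
    return f"{alias}[{rendered}]"


def log_options(alias, options):
    """Alternative: keep the description short and write each option to the log."""
    for key in sorted(options):
        logger.info("%s option %s = %s", alias, key, options[key])
    return alias


# Parameters that identify the instance fit naturally into the description.
socket_description = describe("TextSocketV2", {"host": "localhost", "port": 8888})

# A sink with many options keeps a short description and logs the details.
console_description = log_options("console", {"numRows": 20, "truncate": True})

print(socket_description)
print(console_description)
```

With only two identifying parameters the inline form stays readable; with 10+ options the description would become long, which is the argument for the log entry.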


> ConsoleSink should display alias and options for streaming progress
> -------------------------------------------------------------------
>
>                 Key: SPARK-27975
>                 URL: https://issues.apache.org/jira/browse/SPARK-27975
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.0.0
>            Reporter: Jacek Laskowski
>            Priority: Minor
>
> {{console}} sink shows itself in progress with this internal instance representation as follows:
> {code:json}
>   "sink" : {
>     "description" : "org.apache.spark.sql.execution.streaming.ConsoleSinkProvider@12fa674a"
>   }
> {code}
> That is not very user-friendly; it would be much better for debugging if it included the alias and options, as {{socket}} does:
> {code}
>   "sources" : [ {
>     "description" : "TextSocketV2[host: localhost, port: 8888]",
>     ...
>   } ],
> {code}
> The entire sample progress looks as follows:
> {code}
> 19/06/07 11:47:18 INFO MicroBatchExecution: Streaming query made progress: {
>   "id" : "26bedc9f-076f-4b15-8e17-f09609aaecac",
>   "runId" : "8c365e74-7ac9-4fad-bf1b-397eb086661e",
>   "name" : "socket-console",
>   "timestamp" : "2019-06-07T09:47:18.969Z",
>   "batchId" : 2,
>   "numInputRows" : 0,
>   "inputRowsPerSecond" : 0.0,
>   "durationMs" : {
>     "getEndOffset" : 0,
>     "setOffsetRange" : 0,
>     "triggerExecution" : 0
>   },
>   "stateOperators" : [ ],
>   "sources" : [ {
>     "description" : "TextSocketV2[host: localhost, port: 8888]",
>     "startOffset" : 0,
>     "endOffset" : 0,
>     "numInputRows" : 0,
>     "inputRowsPerSecond" : 0.0
>   } ],
>   "sink" : {
>     "description" : "org.apache.spark.sql.execution.streaming.ConsoleSinkProvider@12fa674a"
>   }
> }
> {code}




