spark-issues mailing list archives

From "Tathagata Das (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-22187) Update unsaferow format for saved state such that we can set timeouts when state is null
Date Tue, 03 Oct 2017 02:56:00 GMT
Tathagata Das created SPARK-22187:
-------------------------------------

             Summary: Update unsaferow format for saved state such that we can set timeouts when state is null
                 Key: SPARK-22187
                 URL: https://issues.apache.org/jira/browse/SPARK-22187
             Project: Spark
          Issue Type: Sub-task
          Components: Structured Streaming
    Affects Versions: 2.2.0
            Reporter: Tathagata Das


Currently, the group state of a user-defined type is encoded as top-level columns in the UnsafeRows
stored in the state store. The timeout timestamp is also saved (when needed) as the last top-level
column. Since the group state is serialized to top-level columns, you cannot save "null" as
a value of state (setting null in all the top-level columns is not equivalent). So we don't
let the user set the timeout without initializing the state for a key. Based on user experience,
this leads to confusion.

This JIRA is to change the row format such that the state is saved as nested columns. This
would allow the state to be set to null, and avoid these confusing corner cases.
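
To illustrate the difference between the two layouts, here is a minimal sketch in plain Python that models rows as tuples. It does not use any Spark API; the RunningCount type and the encode/decode helpers are hypothetical names made up for this example, and the real row format in Spark may differ in its details.

```python
from typing import NamedTuple, Optional, Tuple


class RunningCount(NamedTuple):
    """Hypothetical user-defined state type with two nullable fields."""
    count: Optional[int]
    last_event: Optional[str]


def encode_flat(state: Optional[RunningCount], timeout_ms: int) -> Tuple:
    # Flat layout: each state field is a top-level column,
    # and the timeout timestamp is the last top-level column.
    if state is None:
        # The only way to express "no state" is null in every state column.
        return (None, None, timeout_ms)
    return (state.count, state.last_event, timeout_ms)


def encode_nested(state: Optional[RunningCount], timeout_ms: int) -> Tuple:
    # Nested layout: the whole state lives in one nullable column,
    # followed by the timeout timestamp.
    state_col = None if state is None else tuple(state)
    return (state_col, timeout_ms)


def decode_nested(row: Tuple) -> Tuple[Optional[RunningCount], int]:
    state_col, timeout_ms = row
    state = None if state_col is None else RunningCount(*state_col)
    return state, timeout_ms


empty_fields = RunningCount(count=None, last_event=None)

# Flat layout: a null state and a state with all-null fields collide.
flat_collides = encode_flat(None, 5000) == encode_flat(empty_fields, 5000)

# Nested layout: the two cases stay distinct, so a timeout can be
# stored for a key whose state has not been initialized.
nested_distinct = encode_nested(None, 5000) != encode_nested(empty_fields, 5000)
restored_state, restored_timeout = decode_nested(encode_nested(None, 5000))
```

In this model the flat rows for "no state" and "state with all-null fields" are identical, which is why a timeout cannot be stored safely without a state. With the nested column the null state survives a round trip together with its timeout.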



