spark-issues mailing list archives

From "L. C. Hsieh (Jira)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-34198) Add RocksDB StateStore as external module
Date Sun, 14 Feb 2021 09:37:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284381#comment-17284381 ]

L. C. Hsieh commented on SPARK-34198:
-------------------------------------

I'd tend to take [https://github.com/qubole/spark-state-store] as the baseline, as we are
experimenting with it internally and it seems we are not the only ones using it, based on previous
comments. And yes, I think it is basically from the previous PR [https://github.com/apache/spark/pull/24922].
It looks newer than the first one and has a better structure.

 

 

> Add RocksDB StateStore as external module
> -----------------------------------------
>
>                 Key: SPARK-34198
>                 URL: https://issues.apache.org/jira/browse/SPARK-34198
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 3.2.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>
> Currently Spark SS has only one built-in StateStore implementation, HDFSBackedStateStore.
> It actually uses an in-memory map to store state rows. As there are more and more streaming
> applications, some of them require large state in stateful operations such as streaming
> aggregation and join.
> Several other major streaming frameworks already use RocksDB for state management, so
> it is proven to be a good choice for large state usage. But Spark SS still lacks a built-in
> state store for this requirement.
> We would like to explore the possibility of adding a RocksDB-based StateStore to Spark SS.
> To address the concern about adding RocksDB as a direct dependency, our plan is to add this
> StateStore as an external module first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


