flink-issues mailing list archives

From "Tzu-Li (Gordon) Tai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-12047) Savepoint connector to read / write / process savepoints
Date Thu, 28 Mar 2019 09:22:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-12047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tzu-Li (Gordon) Tai updated FLINK-12047:
----------------------------------------
    Description: 
This JIRA tracks the ongoing efforts and discussions about a means to read / write / process
state in savepoints.

There are already two known existing efforts related to this, both previously mentioned on the
mailing lists:
1. Bravo [1]
2. https://github.com/sjwiesman/flink/tree/savepoint-connector

Essentially, both tools provide a connector to read or write a Flink savepoint, and allow users
to apply Flink's processing APIs to query / process the state in the savepoint.

We should try to converge the efforts on this, and have a savepoint connector like this in
Flink.
With this connector, users should be able to achieve the following high-level benefits:
1. Create savepoints using existing data from other systems (e.g. bootstrapping a Flink job's
state with data in an external database).
2. Derive new state using existing state
3. Query state in savepoints, for example for debugging purposes
4. Migrate the schema of state in savepoints offline, as an alternative to the current, more
limited approach of online migration on state access.
5. Change max parallelism of jobs, or any other kind of fixed configuration, such as operator
uids.
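
As background for item 5, here is a minimal sketch of why max parallelism is normally a fixed
setting: Flink partitions keyed state into key groups, and the number of key groups equals the
max parallelism, so changing it means re-bucketing every key, which a savepoint connector could
do offline. The function names and the CRC32 hash below are illustrative assumptions (the Flink
runtime applies a murmur hash to the key's hashCode); this is not the API of either tool
mentioned above.

```python
import zlib


def key_group(key: str, max_parallelism: int) -> int:
    """Map a key to a key group. CRC32 is a stand-in hash, not Flink's."""
    return zlib.crc32(key.encode("utf-8")) % max_parallelism


def operator_index(group: int, max_parallelism: int, parallelism: int) -> int:
    """Map a key group to the index of the parallel subtask that owns it."""
    return group * parallelism // max_parallelism


def regroup(state: dict, new_max_parallelism: int) -> dict:
    """Re-bucket keyed state {key: value} into {key_group: {key: value}}."""
    groups = {}
    for key, value in state.items():
        group = key_group(key, new_max_parallelism)
        groups.setdefault(group, {})[key] = value
    return groups
```

Calling regroup(state, 256) on state that was originally bucketed with a max parallelism of 128
generally places some keys in different key groups, which is why the state has to be rewritten
rather than restored as-is.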

[1] https://github.com/king/bravo


  was:
This JIRA tracks the ongoing efforts and discussions about a means to read / write / process
state in savepoints.

There are already two known existing efforts related to this, both previously mentioned on the
mailing lists:
1. Bravo [1]
2. https://github.com/sjwiesman/flink/tree/savepoint-connector

Essentially, both tools provide a connector to read or write a Flink savepoint, and allow users
to apply Flink's processing APIs to query / process the state in the savepoint.

We should try to converge the efforts on this, and have a savepoint connector like this in
Flink.
With this connector, users should be able to achieve the following high-level benefits:
1. Create savepoints using existing data from other systems (e.g. bootstrapping a Flink job's
state with data in an external database).
2. Derive new state using existing state
3. Query state in savepoints, for example for debugging purposes
4. Migrate the schema of state in savepoints offline, as an alternative to the current, more
limited approach of online migration on state access.
5. Change max parallelism of jobs.

[1] https://github.com/king/bravo



> Savepoint connector to read / write / process savepoints
> --------------------------------------------------------
>
>                 Key: FLINK-12047
>                 URL: https://issues.apache.org/jira/browse/FLINK-12047
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / State Backends
>            Reporter: Tzu-Li (Gordon) Tai
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
