flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ruidong Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-12405) Introduce BLOCKING_PERSISTENT result partition type
Date Tue, 14 May 2019 02:51:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-12405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Ruidong Li updated FLINK-12405:
-------------------------------
    Description: 
The new {{ResultPartitionType}} : {{BLOCKING_PERSISTENT}} is similar to {{BLOCKING}} except
it might be consumed for several times and will be released after TM shutdown or {{ResultPartition}}
removal request.
 This is the basis for Interactive Programming.


 Here is the brief changes:

* Introduce {{ResultPartitionType}} called {{BLOCKING_PERSISTENT}}
* Introduce {{BlockingShuffleOutputFormat}} and {{BlockingShuffleDataSink}} which are used
to generate {{GenericDataSinkBase}} with user specified {{IntermediateDataSetID}} (passed
from TableAPI in later PR)
* when {{JobGraphGenerator}} sees a {{GenericDataSinkBase}} with {{IntermediateDataSetID}},
it creates a {{IntermediateDataSet}} with this id, then {{OutputFormatVertex}} for this {{GenericDataSinkBase}}
will be excluded in {{JobGraph}}
* So the JobGraph may contains some JobVertex which has more {{IntermediateDataSet}} than
its downstream consumers.

Here are some design notes:
 * Why modify {{DataSet}} and {{JobGraphGenerator}}
 Since Blink Planner is not ready yet, and Batch Table is running on Flink Planner(based on
DataSet).
 There will be another implementation once Blink Planner is ready.
 * Why use a special {{OutputFormat}} as placeholder
 We could add a {{cache()}} method for DataSet, but we do not want to change DataSet API any
more. so a special {{OutputFormat}} as placeholder seems reasonable.

  was:
The new {{ResultPartitionType}} : {{BLOCKING_PERSISTENT}} is similar to {{BLOCKING}} except
it might be consumed for several times and will be released after TM shutdown or {{ResultPartition}}
removal request.
This is the basis for Interactive Programming.


> Introduce BLOCKING_PERSISTENT result partition type
> ---------------------------------------------------
>
>                 Key: FLINK-12405
>                 URL: https://issues.apache.org/jira/browse/FLINK-12405
>             Project: Flink
>          Issue Type: Sub-task
>          Components: API / DataSet
>            Reporter: Ruidong Li
>            Assignee: Ruidong Li
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The new {{ResultPartitionType}} : {{BLOCKING_PERSISTENT}} is similar to {{BLOCKING}}
except it might be consumed for several times and will be released after TM shutdown or {{ResultPartition}}
removal request.
>  This is the basis for Interactive Programming.
>  Here is the brief changes:
> * Introduce {{ResultPartitionType}} called {{BLOCKING_PERSISTENT}}
> * Introduce {{BlockingShuffleOutputFormat}} and {{BlockingShuffleDataSink}} which are
used to generate {{GenericDataSinkBase}} with user specified {{IntermediateDataSetID}} (passed
from TableAPI in later PR)
> * when {{JobGraphGenerator}} sees a {{GenericDataSinkBase}} with {{IntermediateDataSetID}},
it creates a {{IntermediateDataSet}} with this id, then {{OutputFormatVertex}} for this {{GenericDataSinkBase}}
will be excluded in {{JobGraph}}
> * So the JobGraph may contains some JobVertex which has more {{IntermediateDataSet}}
than its downstream consumers.
> Here are some design notes:
>  * Why modify {{DataSet}} and {{JobGraphGenerator}}
>  Since Blink Planner is not ready yet, and Batch Table is running on Flink Planner(based
on DataSet).
>  There will be another implementation once Blink Planner is ready.
>  * Why use a special {{OutputFormat}} as placeholder
>  We could add a {{cache()}} method for DataSet, but we do not want to change DataSet
API any more. so a special {{OutputFormat}} as placeholder seems reasonable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message