flink-issues mailing list archives

From 陈梓立 (JIRA) <j...@apache.org>
Subject [jira] [Commented] (FLINK-10038) Parallel the creation of InputSplit if necessary
Date Fri, 03 Aug 2018 05:13:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567780#comment-16567780 ]

陈梓立 commented on FLINK-10038:
-----------------------------

After taking a look at InputSplit and InputFormat, I found that the interface for creating
input splits is InputSplitSource#createInputSplits, whose implementations vary from FileInputFormat
to JDBCInputFormat and so on.

Since how input splits are created depends on the specific input source, the parallelization
logic would differ from one implementation to another. I therefore suggest implementing it
case by case, where possible and necessary.
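
To make the case-by-case idea concrete, here is a minimal sketch for a source whose work divides
into independent units (for example one file or one partition each). It deliberately uses no Flink
classes: ParallelSplitSketch, createSplitsInParallel and the enumerateUnit function are hypothetical
names for illustration only, and whether a given InputFormat can be decomposed this way is an
assumption that would have to be checked per implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of Flink: enumerates splits for independent
// units of a source on a fixed-size thread pool.
final class ParallelSplitSketch {

    // units:         independent pieces of the source (e.g. file paths); assumed
    //                to be enumerable without shared mutable state
    // enumerateUnit: hypothetical per-source logic that turns one unit into splits
    // parallelism:   number of worker threads, must be > 0
    static List<String> createSplitsInParallel(
            List<String> units,
            Function<String, List<String>> enumerateUnit,
            int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            // Submit one task per unit so slow units do not block the others.
            List<Future<List<String>>> pending = new ArrayList<>();
            for (String unit : units) {
                Callable<List<String>> task = () -> enumerateUnit.apply(unit);
                pending.add(pool.submit(task));
            }
            // Collect results in submission order to keep the output deterministic.
            List<String> splits = new ArrayList<>();
            for (Future<List<String>> future : pending) {
                splits.addAll(future.get());
            }
            return splits;
        } catch (Exception e) {
            throw new RuntimeException("Split creation failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In such a sketch the benefit only shows up when enumerating a single unit is expensive (remote
file listing, metadata queries); for cheap sources the thread pool overhead would likely dominate,
which is one more reason to decide per implementation.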

What are your opinions? Are there other interfaces we need to consider for the creation of input
splits? What do you think is the most elegant and effective way to parallelize this and benefit
from it?

Looking forward to your comments.

> Parallel the creation of InputSplit if necessary
> ------------------------------------------------
>
>                 Key: FLINK-10038
>                 URL: https://issues.apache.org/jira/browse/FLINK-10038
>             Project: Flink
>          Issue Type: Improvement
>    Affects Versions: 1.5.0
>            Reporter: 陈梓立
>            Priority: Major
>
> As a continuation of the discussion in the PR about parallelizing the creation of ExecutionJobVertex
> [here|https://github.com/apache/flink/pull/6353],
> [~StephanEwen] suggested that we could parallelize the creation of InputSplit, from which
> we could gain performance improvements.



