flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3677) FileInputFormat: Allow to specify include/exclude file name patterns
Date Fri, 08 Jul 2016 14:30:11 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367732#comment-15367732 ]

ASF GitHub Bot commented on FLINK-3677:

Github user kl0u commented on the issue:

    Hi @mushketyk,
    Sorry for the late reply.
    I have some general comments on the PR.
    1) Given the FilePathFilter, I think the GlobFilePathFilter is redundant, right?
It is one specific implementation that uses pattern matching, and this can be provided by the programmer.
Given this, the filesFilter in the FileInputFormat (which we could rename to filePathFilter
or something more expressive of its function) becomes a FilePathFilter.
    2) Since the filter now lives in the FileInputFormat, the ContinuousFileMonitoringFunction
should change, because it is now the format that does the filtering. The filter should be removed
from its constructor, and the shouldIgnore() method becomes redundant.
    3) The affected methods in the StreamExecutionEnvironment should change too.
    It would help if you shared a design draft before integrating the changes, so that we
can discuss it and figure out all the parts of the code that change and need testing.
    What do you think?
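
For illustration, points 1) and 2) could be sketched roughly as follows. This is a standalone sketch, not Flink code: PathFilterSketch, InputFormatSketch, FilterSketchDemo and excludeGlob are hypothetical names that stand in for the FilePathFilter and FileInputFormat discussed above, and the convention that filterPath() returns true for paths to skip is an assumption of the sketch. It shows the format owning a general filter while the glob matching is supplied by the programmer.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the FilePathFilter discussed above (not Flink's class).
interface PathFilterSketch {
    // Assumed convention for this sketch: return true if the path should be skipped.
    boolean filterPath(Path path);
}

// Hypothetical stand-in for a file input format that owns the filter (point 1).
class InputFormatSketch {
    // Default filter skips nothing.
    private PathFilterSketch filePathFilter = path -> false;

    void setFilePathFilter(PathFilterSketch filter) {
        this.filePathFilter = filter;
    }

    // The format itself does the filtering, so a monitoring component
    // would not need its own filter or shouldIgnore() check (point 2).
    List<Path> acceptedFiles(List<Path> candidates) {
        List<Path> accepted = new ArrayList<>();
        for (Path candidate : candidates) {
            if (!filePathFilter.filterPath(candidate)) {
                accepted.add(candidate);
            }
        }
        return accepted;
    }
}

class FilterSketchDemo {
    // Glob matching supplied by the programmer instead of a dedicated filter class.
    static PathFilterSketch excludeGlob(String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        // Match against the file name only, not the full path.
        return path -> matcher.matches(path.getFileName());
    }

    public static void main(String[] args) {
        InputFormatSketch format = new InputFormatSketch();
        format.setFilePathFilter(excludeGlob("*.tmp"));
        List<Path> files = Arrays.asList(
                Paths.get("data/part-0.csv"),
                Paths.get("data/part-1.tmp"));
        // Only the path that the filter does not skip is printed.
        System.out.println(format.acceptedFiles(files));
    }
}
```

With this split, an include/exclude pattern is just one possible filter implementation, and any component that lists files can rely on the format instead of carrying a filter of its own.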

> FileInputFormat: Allow to specify include/exclude file name patterns
> --------------------------------------------------------------------
>                 Key: FLINK-3677
>                 URL: https://issues.apache.org/jira/browse/FLINK-3677
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Maximilian Michels
>            Assignee: Ivan Mushketyk
>            Priority: Minor
>              Labels: starter
> It would be nice to be able to specify a regular expression to filter files.

This message was sent by Atlassian JIRA
