flink-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] bowenli86 opened a new pull request #6979: [FLINK-10168] Add FileFilter interface and FileModTimeFilter which sets a read start position for files by modification time
Date Thu, 01 Nov 2018 03:29:38 GMT
bowenli86 opened a new pull request #6979: [FLINK-10168] Add FileFilter interface and FileModTimeFilter
which sets a read start position for files by modification time
URL: https://github.com/apache/flink/pull/6979
 
 
   ## What is the purpose of the change
   
   The motivation is twofold:
   
   1. enable users to set a read start position for files, so that only files modified after a given timestamp are processed
   2. expose more file information to users through a more flexible file filter interface, so that they can define their own filtering rules
   
   ## Brief change log
   
   - add a `FileFilter` interface through which users can access all available information about a file and define filtering rules
   - allow users to set a `FileFilter` on `FileInputFormat`
   - add `FileModTimeFilter`, in which users can set a read start position for files so that Flink only processes files modified after the given timestamp
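
To illustrate the change log above, here is a minimal, self-contained sketch of how a modification-time filter could work. It does not use Flink classes and is not the code in this pull request: `FileInfo`, the `accept` method, the constructor argument, and the `FileFilterSketch.acceptedPaths` helper are all hypothetical names chosen for illustration, and the actual `FileFilter` signature in the PR may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for the file metadata a filter may inspect. */
final class FileInfo {
    final String path;
    final long modificationTimeMillis; // epoch milliseconds
    final long lengthBytes;

    FileInfo(String path, long modificationTimeMillis, long lengthBytes) {
        this.path = path;
        this.modificationTimeMillis = modificationTimeMillis;
        this.lengthBytes = lengthBytes;
    }
}

/** Hypothetical contract: return true to keep the file, false to skip it. */
interface FileFilter {
    boolean accept(FileInfo file);
}

/** Keeps only files modified strictly after the configured read start position. */
final class FileModTimeFilter implements FileFilter {
    private final long readStartMillis;

    FileModTimeFilter(long readStartMillis) {
        this.readStartMillis = readStartMillis;
    }

    @Override
    public boolean accept(FileInfo file) {
        return file.modificationTimeMillis > readStartMillis;
    }
}

/** Hypothetical helper standing in for the place where an input format applies the filter. */
final class FileFilterSketch {
    static List<String> acceptedPaths(List<FileInfo> files, FileFilter filter) {
        List<String> result = new ArrayList<>();
        for (FileInfo file : files) {
            if (filter.accept(file)) {
                result.add(file.path);
            }
        }
        return result;
    }
}
```

Because the filter receives the whole `FileInfo` rather than only a path, a user-defined rule could also combine several attributes, for example modification time and file length. Whether a file modified exactly at the read start position is included is a design choice; this sketch excludes it, following the "modified after" wording.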
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
     - extended unit tests for FileInputFormat in `FileInputFormatTest`
     - added `FileModTimeFilterTest`
   
   ## Does this pull request potentially affect one of the following parts:
   
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes)
     - If yes, how is the feature documented? Documentation will be added in a separate PR under a different JIRA ticket
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
