flink-issues mailing list archives

From "Chesnay Schepler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2435) Add support for custom CSV field parsers
Date Tue, 24 Nov 2015 18:48:11 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025087#comment-15025087 ]

Chesnay Schepler commented on FLINK-2435:

What would be the best place to store the custom parsers? The most straightforward way would
be adding them to the static HashMap in FieldParser.java. This approach worked in tests,
but I reckon it wouldn't work on a cluster, since the now-modified HashMap wouldn't be shipped
along, and the InputFormats don't touch the parsers before that happens.
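The serialization concern above can be illustrated with a minimal, standalone sketch. The class and method names here (ParserRegistrySketch, register, parse) are hypothetical, not Flink's actual API; the point is that entries added to a static map live only in the client JVM and are never serialized with the job, so they would be absent on the cluster's TaskManagers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a static parser registry, mirroring the static
// HashMap in FieldParser.java discussed above. Static state like this is
// per-JVM: registrations made on the client do not travel with the job graph.
public class ParserRegistrySketch {
    private static final Map<Class<?>, Function<String, ?>> PARSERS = new HashMap<>();

    public static <T> void register(Class<T> type, Function<String, T> parser) {
        PARSERS.put(type, parser);
    }

    @SuppressWarnings("unchecked")
    public static <T> T parse(Class<T> type, String raw) {
        Function<String, T> parser = (Function<String, T>) PARSERS.get(type);
        if (parser == null) {
            // This is what a TaskManager would hit: the client-side
            // registration never reached this JVM.
            throw new IllegalStateException("No parser registered for " + type);
        }
        return parser.apply(raw);
    }

    public static void main(String[] args) {
        // Works in local tests, where client and "cluster" share one JVM.
        register(Integer.class, Integer::parseInt);
        System.out.println(parse(Integer.class, "42"));
    }
}
```

An alternative would be registering the parser classes on the InputFormat instance itself, so they are serialized and shipped as part of the format's configuration.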

> Add support for custom CSV field parsers
> ----------------------------------------
>                 Key: FLINK-2435
>                 URL: https://issues.apache.org/jira/browse/FLINK-2435
>             Project: Flink
>          Issue Type: New Feature
>          Components: Java API, Scala API
>    Affects Versions: 0.10.0
>            Reporter: Fabian Hueske
>             Fix For: 1.0.0
> The {{CSVInputFormats}} have only {{FieldParsers}} for Java's primitive types (byte, short, int, long, float, double, boolean, String).
> It would be good to add support for CSV field parsers for custom data types which can be registered in a {{CSVReader}}.
> We could offer two interfaces for field parsers.
> 1. The regular low-level {{FieldParser}} which operates on a byte array and offsets.
> 2. A {{StringFieldParser}} which operates on a String that has been extracted by a {{StringParser}} before. This interface will be easier to implement but less efficient.
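The two proposed interfaces could be sketched roughly as follows. The names and signatures here are illustrative assumptions, not the actual Flink API; the low-level variant loosely mirrors FieldParser's byte-range contract, while the String variant trades the extra String allocation for a much simpler implementation surface.

```java
// Hedged sketch of the two parser interfaces the issue proposes.
public class FieldParserSketch {

    // 1. Low-level parser: operates directly on the record's byte array and
    //    offsets, avoiding an intermediate String allocation. (Signature is
    //    an assumption loosely modeled on FieldParser.parseField.)
    public interface ByteFieldParser<T> {
        // Parses bytes[start..limit), returns the offset after the field.
        int parseField(byte[] bytes, int start, int limit, char delimiter);
        T getLastResult();
    }

    // 2. Convenience parser: receives a String already extracted by a
    //    StringParser; easier to implement, but pays for the String copy.
    public interface StringFieldParser<T> {
        T parseField(String value);
    }

    public static void main(String[] args) {
        // Example: a StringFieldParser for a hypothetical "x;y" point field.
        StringFieldParser<int[]> pointParser = value -> {
            String[] parts = value.split(";");
            return new int[] { Integer.parseInt(parts[0]), Integer.parseInt(parts[1]) };
        };
        int[] p = pointParser.parseField("3;7");
        System.out.println(p[0] + "," + p[1]);
    }
}
```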

This message was sent by Atlassian JIRA
