drill-dev mailing list archives

From "Neal McBurnett (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3178) csv reader should allow newlines inside quotes
Date Sat, 23 May 2015 04:45:17 GMT
Neal McBurnett created DRILL-3178:

             Summary: csv reader should allow newlines inside quotes 
                 Key: DRILL-3178
                 URL: https://issues.apache.org/jira/browse/DRILL-3178
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.0.0
         Environment: Ubuntu Trusty 14.04.2 LTS

            Reporter: Neal McBurnett

When reading a CSV file that contains newlines within quoted strings, e.g. via

    select * from dfs.`/tmp/q.csv`;

Drill 1.0 says:

    Error: SYSTEM ERROR: com.univocity.parsers.common.TextParsingException: Error processing input: Cannot use newline character within quoted string

But many tools produce CSV files with newlines in quoted strings, so Drill should be able to
handle them.
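
As an illustration of how easily such files arise, here is a minimal sketch using Python's standard csv module. The column names and values are made up for the example and are not the contents of the /tmp/q.csv file from this report. The writer quotes a field that contains a newline, so one record spans two physical lines, and the reader joins them back into a single record.

```python
import csv
import io

# Hypothetical sample data: the second field of the data row contains a newline.
rows = [["id", "comment"], ["1", "first line\nsecond line"]]

# Write the rows as CSV; the field with the embedded newline gets quoted.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
text = buf.getvalue()

# The file has three physical lines but only two CSV records.
physical_lines = text.splitlines()
records = list(csv.reader(io.StringIO(text)))

print(len(physical_lines), "physical lines,", len(records), "records")
```

A reader that splits on every newline before looking at quotes would see three rows here instead of two.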

Workaround: the csvquote program (https://github.com/dbro/csvquote) can encode embedded commas
and newlines, and even decode them later if desired.
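
If csvquote is not available, a similar preprocessing step can be sketched in Python. This is not how csvquote itself works; the function name encode_embedded_newlines and the placeholder string are hypothetical choices for this example. It rewrites the file so that every record fits on one physical line, by replacing newlines inside fields with a visible placeholder.

```python
import csv
import io


def encode_embedded_newlines(csv_text, placeholder="\\n"):
    """Return csv_text with newlines inside fields replaced by placeholder.

    Hypothetical helper for illustration. The replacement is reversible
    only if the placeholder does not otherwise occur in the data.
    """
    # newline="" keeps embedded line endings exactly as they appear in the input.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for record in reader:
        writer.writerow([
            field.replace("\r\n", placeholder)
                 .replace("\n", placeholder)
                 .replace("\r", placeholder)
            for field in record
        ])
    return out.getvalue()


sample = 'id,comment\n1,"first line\nsecond line"\n'
encoded = encode_embedded_newlines(sample)
print(encoded)
```

After this step each record occupies exactly one line, so the output can be saved and queried without triggering the newline error. Embedded commas are left alone because the writer re-quotes any field that contains one.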
