Hi Benj,
Testing with the latest code, this error does not show:
0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`/data/bar.csv`;
+----------------------------------------------------------------------------------+---------+
| EXPR$0 | EXPR$1 |
+----------------------------------------------------------------------------------+---------+
| hello | world |
| a01234567890a01234567890a01234567890a01234567890a01234567890a01234567890a01234567890a01234567890a
...............890
The error you saw is raised in the append() method in
FieldVarCharOutput.java, but somehow when I ran the example above, that
code path was not taken.
However, the printout is indeed truncated (at about 10,000 characters), and
the columns that follow are cut off as well. The actual varchar value
retains its full length:
0: jdbc:drill:zk=local> select char_length(columns[0]) from dfs.`/data/bar.csv`;
+---------+
| EXPR$0 |
+---------+
| 5 |
| 72061 |
+---------+
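
In case it helps with reproducing this, below is a minimal sketch of a generator for a test file of the shape described in the original report: a header line, a short row, and a row whose first field is longer than 65536 characters. The class name BigFieldCsv, the buildCsv helper, and the default output path test.csv are illustrative assumptions; the 66000-character length is taken from the example in the report.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BigFieldCsv {

    // Builds the CSV text: a header, a short row, and a row whose first
    // field has exactly bigLen characters (hypothetical helper).
    static String buildCsv(int bigLen) {
        StringBuilder sb = new StringBuilder(bigLen + 32);
        sb.append("col1,col2\n");
        sb.append("w,x\n");
        for (int i = 0; i < bigLen; i++) {
            // Cycle through a..z so the field contains no delimiter or quote.
            sb.append((char) ('a' + i % 26));
        }
        sb.append(",z\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Output path is an assumption; pass another path as the first argument.
        Path out = Paths.get(args.length > 0 ? args[0] : "test.csv");
        Files.write(out, buildCsv(66000).getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + out.toAbsolutePath());
    }
}
```

Running the queries from this thread against the generated file, with extractHeader set to true and then to false, should show whether the 65536 limit is hit in a given build.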
-- Boaz
On 1/23/19 12:29 PM, benj.dev wrote:
> Hi,
>
> With a CSV file test.csv
> col1,col2
> w,x
> ...y...,z
>
> where ...y... is a string of more than 65536 characters (let's say 66000, for example)
>
> With this extract of the storage configuration:
> "csv": { "type": "text", "extensions": [ "csv" ],
> "extractHeader": true, "delimiter": "," },
> the query
> SELECT * FROM tmp.`test.csv`
> fails with:
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> column name columns
> column index
> Fragment 0:0
>
> Same error with
> SELECT col1, col2 FROM TABLE(tmp.`test.csv`(type => 'text',
> fieldDelimiter => ',', extractHeader => true))
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> columnIndex 0
> Limit 65536
> Fragment 0:0
>
> But it's OK with
> SELECT columns[0], columns[1] FROM TABLE(tmp.`test.csv`(type => 'text',
> fieldDelimiter => ',', extractHeader => false))
> col1 | col2
> +---------+------
> | w | x
> | ...y... | z
>
> Why is it not possible to work with a big varchar with extractHeader=true
> even though it works with extractHeader=false?
>
>
> On a related point, for the last case: when printing, ...y... is
> truncated and the following fields (here only col2) are not displayed for
> this row (without any mark or warning), although the real value is correct.
> Maybe it is a bug that the other columns are not displayed. Maybe the
> length of the printed value should be controlled by an option, and
> a marker should indicate when truncation has occurred.
>
> Thanks for any explanation