drill-user mailing list archives

From Carles Tarsà <carles.ta...@reviewpro.com>
Subject reading different content from sequence files
Date Thu, 24 Nov 2016 15:45:11 GMT

I've been trying Drill because it looks very promising, but I've run into 
some issues that I couldn't solve. I'm wondering whether I've configured 
something incorrectly or whether there's a bug.

The first issue is that when I try to read a sequence file, the content 
that I get is different from what is in the original file.

$ hadoop fs -text /user/ctarsa/esborram2.seq
16/11/24 16:27:37 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes 
where applicable
key0       value0
key1       value1
key10      value10
key        {"review":"{"author":"àéïöç"}"}

When I try to read it back from Drill:

0: jdbc:drill:zk=local>  select (convert_from(binary_key,'UTF8')), 
(convert_from(binary_value,'UTF8')) from 
| EXPR$0 | EXPR$1 |
| key0 | value0 |
| key1 | value1 |
| key10 | value10 |
| key | ${"review":"{"author":"àéïöç"}"} |
| key | 

5 rows selected (0.308 seconds)

Notice that there are some extra characters (marked in red in the 
original message), such as the "$" before the JSON value. Also notice 
that in the first rows the | don't seem to be aligned.
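
One possible explanation, offered as a sketch rather than a confirmed
diagnosis: if the keys and values in this file were written as Hadoop Text
(an assumption; the code that wrote the file is not shown here), each
serialized field starts with a variable-length integer holding its byte
length, and for fields of up to 127 bytes that prefix is a single byte. A
reader that returns the raw serialized bytes would then show that length
byte as a character. The Python snippet below only mimics that framing
(text_serialize is a hypothetical helper, not a Hadoop or Drill API) to
check the arithmetic: the JSON value is 36 bytes in UTF-8, and 36 is the
ASCII code for "$".

```python
# The JSON value as printed by `hadoop fs -text` above. The accented
# characters are written as escapes so they are precomposed (2 bytes each
# in UTF-8): à é ï ö ç.
value = '{"review":"{"author":"\u00e0\u00e9\u00ef\u00f6\u00e7"}"}'


def text_serialize(s: str) -> bytes:
    """Mimic Text-style framing for short strings (hypothetical helper).

    Assumption: the length prefix is a single byte, which only holds for
    payloads of at most 127 bytes.
    """
    payload = s.encode("utf-8")
    if len(payload) > 127:
        raise ValueError("sketch only covers single-byte length prefixes")
    return bytes([len(payload)]) + payload


raw = text_serialize(value)

# Decoding the raw bytes as UTF-8 turns the length byte into a character.
decoded = raw.decode("utf-8")

print(len(value.encode("utf-8")))  # 36
print(decoded[0])                  # $

# Short fields get a non-printing control byte instead of a visible one.
key0_prefix = text_serialize("key0")[0]      # 4
value0_prefix = text_serialize("value0")[0]  # 6
```

Under the same assumption, key0 and value0 would carry the non-printing
bytes 0x04 and 0x06 in front, which could account for the misaligned | in
the first rows. If this is the cause, skipping the first byte of each
short field before decoding would be one way to test it.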

I've tried it on a Mac with the latest Drill (1.8.0) and Hadoop 
2.6.0-cdh5.4.4, and also on a Linux box. I've also tried different 
compression settings (no compression, LZO, LZO Block, LZO Record) for the 
sequence file, with no success.

Can you please help?


