drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From fritz wijaya <fritz.wij...@gmail.com>
Subject Problem with malformed data from json
Date Wed, 06 May 2015 06:49:30 GMT
I'm recently run drill to explore data from hadoop cluster. But, I have
problem with when running the query againts our data source. The query
always failed due to malformed json data. Because, the data itself pretty
raw, It maybe contains some malformed json format in and there. Its
difficult to do cleaning itself due to the data size (around hundreds of
gigs text file).

My question is, Is there anyway to exclude/skip the malformed records, and
make it into separate result, just like the spark/shark do? How to solve
this problem elegantly?

Thank you.


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message