Currently, Drill does not support skipping bad records. It does, however,
point you to where the problem is.
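
One possible workaround is to separate the bad records outside Drill before
querying. The following is only a rough sketch, not a Drill feature: it assumes
the files hold one JSON record per line, and the function name split_records
is made up for illustration. It streams the input line by line, so the file
size should not matter much.

```python
import io
import json


def split_records(lines, good_out, bad_out):
    """Copy each line that parses as JSON to good_out, the rest to bad_out.

    Assumes one JSON record per line (an assumption about the data layout).
    Blank lines are dropped. Returns a (good_count, bad_count) tuple.
    """
    good = 0
    bad = 0
    for line in lines:
        record = line.strip()
        if not record:
            continue  # skip empty lines entirely
        try:
            json.loads(record)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            bad_out.write(record + "\n")
            bad += 1
        else:
            good_out.write(record + "\n")
            good += 1
    return good, bad


if __name__ == "__main__":
    # Small in-memory demo; with real data, pass open file objects instead.
    sample = io.StringIO('{"id": 1}\n{"id": \n{"id": 3}\n')
    good_file, bad_file = io.StringIO(), io.StringIO()
    print(split_records(sample, good_file, bad_file))
```

The "good" output can then be queried with Drill, and the "bad" output kept
as the separate result for inspection.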
-Hanifi
On Tue, May 5, 2015 at 11:49 PM, fritz wijaya <fritz.wijaya@gmail.com>
wrote:
> I recently started running Drill to explore data from our Hadoop cluster,
> but I have a problem when running queries against our data source. The query
> always fails due to malformed JSON data. Because the data is pretty raw, it
> may contain malformed JSON here and there. It's difficult to clean the data
> ourselves due to its size (hundreds of gigabytes of text files).
>
> My question is: is there any way to exclude/skip the malformed records and
> put them into a separate result, just like Spark/Shark does? How can I solve
> this problem elegantly?
>
> Thank you.
>
> Regards,
> Fritz
>