drill-user mailing list archives

From Hanifi Gunes <hgu...@maprtech.com>
Subject Re: Problem with malformed data from json
Date Wed, 06 May 2015 17:52:00 GMT
Currently, Drill does not support skipping bad records. It does, however,
report where in the data the problem is.

-Hanifi
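
One possible workaround, outside of Drill, is to pre-filter the files before
querying them. The sketch below is not a Drill feature. It assumes the data is
newline-delimited JSON (one object per line), which the thread does not
confirm, and the function names and file paths are hypothetical. It streams
the input line by line, so it should not need to hold the whole file in memory.

```python
import json
from typing import Iterable, TextIO, Tuple


def split_json_lines(lines: Iterable[str], good: TextIO, bad: TextIO) -> Tuple[int, int]:
    """Write lines that parse as a JSON object to `good`, all others to `bad`.

    Assumption: each record sits on its own line. Returns (good_count, bad_count).
    """
    n_good = n_bad = 0
    for line in lines:
        record = line.strip()
        if not record:
            continue  # skip blank lines entirely
        try:
            parsed = json.loads(record)
            # Assumption: a valid record is a JSON object, not a bare scalar or array.
            is_valid = isinstance(parsed, dict)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            is_valid = False
        if is_valid:
            good.write(record + "\n")
            n_good += 1
        else:
            bad.write(record + "\n")
            n_bad += 1
    return n_good, n_bad


def split_file(src_path: str, good_path: str, bad_path: str) -> Tuple[int, int]:
    """Hypothetical convenience wrapper that applies split_json_lines to files on disk."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
         open(good_path, "w", encoding="utf-8") as good, \
         open(bad_path, "w", encoding="utf-8") as bad:
        return split_json_lines(src, good, bad)
```

For example, split_file("raw.json", "clean.json", "rejected.txt") would leave
the parseable records in clean.json for Drill to query and collect the rest in
rejected.txt for inspection. Records that span multiple lines would need a
different approach.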

On Tue, May 5, 2015 at 11:49 PM, fritz wijaya <fritz.wijaya@gmail.com>
wrote:

> I recently started running Drill to explore data from our Hadoop cluster,
> but I have a problem when running queries against our data source. The
> queries always fail due to malformed JSON data. Because the data is pretty
> raw, it may contain malformed JSON here and there. It is difficult to
> clean the data ourselves because of its size (text files on the order of
> hundreds of gigabytes).
>
> My question is: is there any way to exclude or skip the malformed records
> and put them into a separate result, as Spark/Shark do? How can I solve
> this problem elegantly?
>
> Thank you.
>
> Regards,
> Fritz
>
