drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Charles Givre <cgi...@gmail.com>
Subject Re: Dealing with bad data when trying to do date computations
Date Tue, 28 Feb 2017 16:15:10 GMT
Hi John,
I believe that Drill 1.9 includes a REGEXP_MATCHES( <source>, <pattern> )
function which does what you'd expect it to.  I'm not sure when this was
introduced, so it maybe in earlier versions of Drill.
Best,
-- C

On Tue, Feb 28, 2017 at 11:03 AM, John Omernik <john@omernik.com> wrote:

> I have a data set that has birthdays in YYYY-MM-DD format.
>
> Most of this data is great. I am trying to compute the age using
>
> EXTRACT(year from age(dob))
>
>
> But some of my data is crapola... let's call it alternative data...
>
>
> When I try to run the Extract function, I get
>
> Error: SYSTEM ERROR: IllegalFieldValueException: Value 0 for monthOfYear
> must be in the range [1,12]
>
> Fragment 5:17
>
> [Error Id: 62f90784-c9f4-4362-9710-a37464fc801a on drillnode:20005]
>
>
> I've tried an ugly where clause, and this works:
>
> where
>
> (dob LIKE '%-01-%' or dob LIKE '%-02-%' or dob LIKE '%-03-%' or dob LIKE
> '%-04-%' or dob LIKE '%-05-%' or dob LIKE '%-06-%' or dob LIKE '%-07-%' or
> dob LIKE '%-08-%' or dob LIKE '%-09-%' or
>
> dob LIKE '%-1-%' or dob LIKE '%-2-%' or dob LIKE '%-3-%' or dob LIKE
> '%-4-%' or dob LIKE '%-5-%' or dob LIKE '%-6-%' or dob LIKE '%-7-%' or dob
> LIKE '%-8-%' or dob LIKE '%-9-%' or
>
> dob LIKE '%-10-%' or dob LIKE '%-11-%' or dob LIKE '%-12-%')
>
>
> But WOW is that ugly. I could add the jar for regex contains, and make it
> much easier (do we have a regex search function built into drill? I think
> we should at this point...)
>
>
> Is there another way to say try the extra function, and catch a failure,
>  and ignore on failure? What if we had a cast function that returned NULL
> on failure so we could use it in the where clause?  Any other more elegant
> ways to handle this?
>
>
> Thanks!
>
>
> John
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message