drill-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Omernik <j...@omernik.com>
Subject Dealing with bad data when trying to do date computations
Date Tue, 28 Feb 2017 16:03:09 GMT
I have a data set that has birthdays in YYYY-MM-DD format.

Most of this data is great. I am trying to compute the age using

EXTRACT(year from age(dob))


But some of my data is crapola... let's call it alternative data...


When I try to run the Extract function, I get

Error: SYSTEM ERROR: IllegalFieldValueException: Value 0 for monthOfYear
must be in the range [1,12]

Fragment 5:17

[Error Id: 62f90784-c9f4-4362-9710-a37464fc801a on drillnode:20005]


I've tried an ugly where clause, and this works:

where

(dob LIKE '%-01-%' or dob LIKE '%-02-%' or dob LIKE '%-03-%' or dob LIKE
'%-04-%' or dob LIKE '%-05-%' or dob LIKE '%-06-%' or dob LIKE '%-07-%' or
dob LIKE '%-08-%' or dob LIKE '%-09-%' or

dob LIKE '%-1-%' or dob LIKE '%-2-%' or dob LIKE '%-3-%' or dob LIKE
'%-4-%' or dob LIKE '%-5-%' or dob LIKE '%-6-%' or dob LIKE '%-7-%' or dob
LIKE '%-8-%' or dob LIKE '%-9-%' or

dob LIKE '%-10-%' or dob LIKE '%-11-%' or dob LIKE '%-12-%')


But WOW is that ugly. I could add the jar for regex contains, and make it
much easier (do we have a regex search function built into drill? I think
we should at this point...)


Is there another way to say try the extra function, and catch a failure,
 and ignore on failure? What if we had a cast function that returned NULL
on failure so we could use it in the where clause?  Any other more elegant
ways to handle this?


Thanks!


John

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message