drill-user mailing list archives

From Nicolas Paris <nipari...@gmail.com>
Subject Re: REGEX search Operator
Date Thu, 04 Feb 2016 19:38:06 GMT
You mean:
userRegex=>javaRegex
"\d" => "\\d"
"\w" => "\\w"
"\n" => "\n"
I can do that with a regex, I guess.
I will give it a try.
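
To illustrate the mapping above: the doubled backslash is Java source-literal syntax, so a pattern that reaches java.util.regex as runtime data may not need any conversion. The sketch below is plain Java, not Drill code. The class and method names (EscapeDemo, singleEscaped, matches) are made up for illustration, and whether Drill's SQL parser passes backslashes through unchanged is an assumption that this thread does not confirm.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // Builds the single-escaped pattern \d\d\d\d-\d\d-\d\d at runtime, standing in
    // for a pattern typed by a user. '#' is only a placeholder for the backslash.
    public static String singleEscaped() {
        String backslash = String.valueOf((char) 92); // code point 92 is the backslash
        return "#d#d#d#d-#d#d-#d#d".replace("#", backslash);
    }

    // True when the whole input matches the pattern.
    public static boolean matches(String pattern, String input) {
        return Pattern.compile(pattern).matcher(input).matches();
    }

    public static void main(String[] args) {
        String user = singleEscaped();
        // The Java literal with doubled backslashes denotes the same string.
        System.out.println(user.equals("\\d\\d\\d\\d-\\d\\d-\\d\\d")); // true
        // The single-escaped runtime string already works as a regex.
        System.out.println(matches(user, "2016-02-04")); // true
        // Doubling the backslashes at runtime changes the meaning: the regex
        // now expects a literal backslash followed by the letter d.
        String doubled = user.replace("\\", "\\\\");
        System.out.println(matches(doubled, "2016-02-04")); // false
    }
}
```

If that assumption holds, a userRegex=>javaRegex conversion would only be needed where some layer before java.util.regex consumes one level of backslashes.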


2016-02-04 19:37 GMT+01:00 John Omernik <john@omernik.com>:

> So my question on the double escape: is there no way to handle that so the
> user can use single-escaped regex? I know many folks who use big data
> platforms to test large complex regexes for things like security appliances,
> and having to convert the regex seems like a lot of work if you consider
> every user has to do that.  If there was a way to do it in Drill, that
> would save countless people hours and prevent many mistakes.
>
> On Thu, Feb 4, 2016 at 12:03 PM, Nicolas Paris <niparisco@gmail.com>
> wrote:
>
> > John, Jason,
> >
> > 2016-02-04 18:47 GMT+01:00 John Omernik <john@omernik.com>:
> >
> > > I'd be curious how you are implementing the regex... using Java's regex
> > > libraries? etc.
> > >
> > Yeah, I use java.util.regex
> >
> > > I know one thing with Hive that always bothered me was the need to double
> > > escape things.
> > >
> > > '\d\d\d\d-\d\d-\d\d'  needed to be '\\d\\d\\d\\d-\\d\\d-\\d\\d'. If we can
> > > avoid that it would be AWESOME.
> > >
> > My guess is this comes from the way Java handles strings. All languages I
> > have used need the double escape.
> >
> >
> > > On Thu, Feb 4, 2016 at 11:37 AM, Jason Altekruse <altekrusejason@gmail.com>
> > > wrote:
> >
> > The code is here: https://github.com/parisni/drill-simple-contains
> > It's disturbing how simple it is...
> >
> > > > I think you should actually just put the function in Drill itself. System
> > > > native functions are implemented in the same interface as UDFs, because our
> > > > mechanism for evaluating them is very efficient (we code generate code
> > > > blocks by linking together the bodies of the individual functions to
> > > > evaluate a complete expression).
> > >
> > Well, the folder tree is quite impressive (https://github.com/apache/drill).
> >
> > Which folder is supposed to be "Drill itself"?
> >
> > > > You can open a JIRA, marking it a feature request. You can open a pull
> > > > request against the Apache GitHub repo, making sure you follow the standard
> > > > format for your commit message, prefixing it with the JIRA number.
> > > > Example:
> > > > DRILL-XXXX: Feature description
> > > >
> > > > This will automatically link the PR to your JIRA.
> > >
> > OK, I will try. Thanks a lot.
> >
> > > > - Jason
> > > >
> > > > On Thu, Feb 4, 2016 at 8:44 AM, Nicolas Paris <niparisco@gmail.com>
> > > > wrote:
> > > >
> > > > > Jason, I have it working.
> > > > >
> > > > > Just tell me how to proceed with the PR.
> > > > > 1. Where do I put my Maven project? Which folder in my Drill GitHub fork?
> > > > > 2. Do I need a JIRA? How do I proceed?
> > > > >
> > > > > For now, I have only published it on my GitHub account as a separate project.
> > > > >
> > > > > Thanks
> > > > >
> > > > > 2016-02-04 16:52 GMT+01:00 Jason Altekruse <altekrusejason@gmail.com>:
> > > > >
> > > > > > Awesome, thanks!
> > > > > >
> > > > > > On Thu, Feb 4, 2016 at 7:44 AM, Nicolas Paris <niparisco@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Well, I am creating a UDF.
> > > > > > > Good exercise.
> > > > > > > I hope to have a PR soon.
> > > > > > >
> > > > > > > 2016-02-04 16:37 GMT+01:00 Jason Altekruse <altekrusejason@gmail.com>:
> > > > > > >
> > > > > > > > I didn't realize that we were lacking this functionality. As the
> > > > > > > > repeated_contains operator handles wildcards it makes sense to add such a
> > > > > > > > function to drill.
> > > > > > > >
> > > > > > > > It should be simple to implement, would someone like to open a JIRA and
> > > > > > > > submit a PR for this?
> > > > > > > >
> > > > > > > > - Jason
> > > > > > > >
> > > > > > > > On Tue, Feb 2, 2016 at 8:56 AM, John Omernik <john@omernik.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I would like to see something like this as well, even if it's an included
> > > > > > > > > UDF like REGEX(field, pattern) using Java's library for regex like Hive
> > > > > > > > > does.  That would be EXTREMELY helpful.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Feb 2, 2016 at 6:55 AM, Nicolas Paris <niparisco@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > > ANSI SQL doesn't define a regex operator.
> > > > > > > > > > > Neither does Drill.
> > > > > > > > > > >
> > > > > > > > > > Drill has SQL function extensions like "REPEATED_CONTAINS" that appear
> > > > > > > > > > to handle regex. Could a regex operator be provided by a new SQL extension?
> > > > > > > > > > I guess I could create my own functions in Java, right? Maybe push them
> > > > > > > > > > to GitHub then?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Isn't the 'LIKE' operator enough?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Sadly not, I am looking for complex pattern matching.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > > Miura, Masahide
> > > > > > > > > > >
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Nicolas Paris [mailto:niparisco@gmail.com]
> > > > > > > > > > > Sent: Tuesday, February 02, 2016 9:04 PM
> > > > > > > > > > > To: user@drill.apache.org
> > > > > > > > > > > Subject: REGEX search Operator
> > > > > > > > > > >
> > > > > > > > > > > Hello,
> > > > > > > > > > >
> > > > > > > > > > > I can't find any reference in the documentation about a regex
> > > > > > > > > > > operator.
> > > > > > > > > > >
> > > > > > > > > > > I would like to be able to query this way:
> > > > > > > > > > >
> > > > > > > > > > > SELECT *
> > > > > > > > > > > FROM xxx
> > > > > > > > > > > WHERE text_field regexOperator 'regex_pattern';
> > > > > > > > > > >
> > > > > > > > > > > Thanks for helping,
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
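
For reference, the REGEX(field, pattern) behaviour requested earlier in the thread can be sketched in plain Java with java.util.regex. This is not the code from drill-simple-contains and it does not use Drill's UDF interface. The class and method names are hypothetical, and the partial-match (find) semantics and the null handling are assumptions.

```java
import java.util.regex.Pattern;

public class RegexContains {
    // Hypothetical helper: true when the pattern occurs anywhere in the field.
    public static boolean regexContains(String field, String pattern) {
        if (field == null || pattern == null) {
            return false; // assumption: treat missing values as "no match"
        }
        return Pattern.compile(pattern).matcher(field).find();
    }

    public static void main(String[] args) {
        String datePattern = "\\d{4}-\\d{2}-\\d{2}"; // Java literal for \d{4}-\d{2}-\d{2}
        System.out.println(regexContains("login at 2016-02-04 19:38", datePattern)); // true
        System.out.println(regexContains("no date here", datePattern)); // false
    }
}
```

Compiling the pattern on every call keeps the sketch short; a function evaluated per row would likely compile it once and reuse the Pattern.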
