lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Muir <rcm...@gmail.com>
Subject Re: Customizing Regexp syntax in Lucene
Date Mon, 06 Apr 2015 01:09:46 GMT
On Sun, Apr 5, 2015 at 5:08 PM, code fx9 <codefx9@gmail.com> wrote:
> Hi,
> We are using Lucene indirectly via ElasticSearch. We would like to use RE2
> syntax for running regex queries against Lucene. We are already using RE2
> syntax for other parts of our system, so not ability to use the same syntax
> is a deal-breaker for us.
>
> Recently Google has released a pure Java implementation of this library on
> GitHub. Will it be possible to actually use RE2/J library to run regex
> queries in Lucene? I understand that it might require customizing Lucene
> source code. Can you give me any idea how complex and time consuming such
> endeavor might be.
>
> RE2 Syntax: https://re2.googlecode.com/hg/doc/syntax.html
> RE2/J :https://github.com/google/re2j
>
> Thanks.

The only place in lucene that "knows" about syntax is RegexpQuery. It
only has logic for parsing that syntax into a state machine (Automaton
class), otherwise AutomatonQuery takes care of the execution.

Maybe you could create an Re2Query class that works in a similar way:
e.g. uses RE2/J library to parse the syntax into its state machine
representation and translates that to Automaton representation used by
Lucene.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message