lucene-java-user mailing list archives

From Dawid Weiss <dawid.we...@gmail.com>
Subject Re: Automata and Transducer on Lucene 6
Date Tue, 18 Apr 2017 18:33:20 GMT
> I'd like to read something written by whoever designed these classes: what
> motivated them, usage examples, what they are good for and what they are not.
> Maybe a history of the development of Automata on Lucene

Are you looking for a historical book on Lucene development or are you
looking to solve a particular problem? Perhaps if you explain the problem
then it'll be easier to come up with answers (or hints as to where you may
begin looking for them).

You asked so many questions that it's hard to address them without
spending a few hours just typing...

But just so that it's evident I tried:

- the FST class implements a transducer; check out examples (tests) as they're
the best documentation on how to use these classes; they do use byte[]
underneath;

- Automaton etc. are completely independent and used for slightly different
purposes (it's the brics library ported to Lucene). Again -- tests will be
helpful to understand how they work. These classes use an object
representation of states and transitions until you "compile" them into a
RunAutomaton, which is essentially an immutable deterministic automaton
compiled into byte[]. It has benefits, but also drawbacks (you can't change
it). Determinization of arbitrary automata can be costly or prohibitive.
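
To make the "build with objects, then compile into something immutable" idea
concrete, here is a minimal plain-Java sketch. It is not Lucene code and does
not use Lucene's API: AutomatonSketch, State, Builder and Compiled are
hypothetical names, and the flat char[]/int[] layout is a simplification
chosen for illustration, not the layout the real classes use. The builder only
accepts literal strings, so the result is deterministic by construction and
the (potentially costly) determinization step is sidestepped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Conceptual stand-in only; NOT Lucene's Automaton/RunAutomaton implementation. */
final class AutomatonSketch {

    /** Mutable, object-based state: easy to modify while building. */
    static final class State {
        final TreeMap<Character, State> next = new TreeMap<>();
        boolean accept;
    }

    /** Build phase: a union of literal strings, stored as an object graph. */
    static final class Builder {
        final State start = new State();

        Builder add(String s) {
            State cur = start;
            for (char c : s.toCharArray()) {
                cur = cur.next.computeIfAbsent(c, k -> new State());
            }
            cur.accept = true;
            return this;
        }

        /** "Compile" the object graph into flat arrays that are never modified again. */
        Compiled compile() {
            Map<State, Integer> ids = new IdentityHashMap<>();
            List<State> order = new ArrayList<>();
            ids.put(start, 0);
            order.add(start);
            // Number the states breadth-first, starting from state 0.
            for (int i = 0; i < order.size(); i++) {
                for (State t : order.get(i).next.values()) {
                    if (ids.putIfAbsent(t, order.size()) == null) {
                        order.add(t);
                    }
                }
            }
            int n = order.size();
            int[] offsets = new int[n + 1];
            boolean[] accept = new boolean[n];
            for (int i = 0; i < n; i++) {
                offsets[i + 1] = offsets[i] + order.get(i).next.size();
                accept[i] = order.get(i).accept;
            }
            char[] labels = new char[offsets[n]];
            int[] targets = new int[offsets[n]];
            int k = 0;
            for (State s : order) {
                // TreeMap iterates in sorted label order, which run() relies on.
                for (Map.Entry<Character, State> e : s.next.entrySet()) {
                    labels[k] = e.getKey();
                    targets[k++] = ids.get(e.getValue());
                }
            }
            return new Compiled(offsets, labels, targets, accept);
        }
    }

    /** Run phase: read-only table lookups, no object graph to chase. */
    static final class Compiled {
        private final int[] offsets;   // transitions of state s: [offsets[s], offsets[s + 1])
        private final char[] labels;   // sorted within each state's range
        private final int[] targets;
        private final boolean[] accept;

        Compiled(int[] offsets, char[] labels, int[] targets, boolean[] accept) {
            this.offsets = offsets;
            this.labels = labels;
            this.targets = targets;
            this.accept = accept;
        }

        boolean run(String s) {
            int state = 0;
            for (int i = 0; i < s.length(); i++) {
                int pos = Arrays.binarySearch(
                        labels, offsets[state], offsets[state + 1], s.charAt(i));
                if (pos < 0) {
                    return false; // no transition for this character
                }
                state = targets[pos];
            }
            return accept[state];
        }
    }

    public static void main(String[] args) {
        Compiled c = new Builder().add("lucene").add("lucid").compile();
        System.out.println(c.run("lucid")); // true
        System.out.println(c.run("luc"));   // false: a prefix, not an accepted string
    }
}
```

The trade-off mentioned above shows up directly: strings added to the Builder
after compile() do not affect an already compiled instance, so any change
means building and compiling again.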

Dawid


On Tue, Apr 18, 2017 at 3:58 PM, Juarez Sampaio
<juarez@simbioseventures.com> wrote:
> Hello everyone,
>
> Recently I've watched a few videos and read a few blog posts on Lucene's
> Automata and how one can speed up things by 100x when properly using
> Automata and Transducers. "I can definitely use a boost like this", right?
> The problem is that the material I've read was written for Lucene 4, and it
> seems the API has changed a lot since then.
>
> To begin with, *I can't find transducers* anywhere and I'm missing a few
> Automata construction capabilities such as union (it used to be located on
> the class BasicOperations). I think what I am really missing is an intro to
> Automata classes on Lucene 6. *Can someone point me to a link introducing
> Automata (and possibly Transducers) on Lucene 6?*
>
> So far I've been learning by navigating java docs with ctrl + F, which
> hasn't been productive: It took me a while to figure out I had to use a
> RunAutomaton to check that the automaton accepts a given char sequence. And
> before that I had tried to manually start from node 0 and manually traverse
> the Automata and check for a final state at the end of a String. I'd really
> appreciate some guidance here.
>
> I'd like to read something written by whoever designed these classes: what
> motivated them, usage examples, what they are good for and what they are not.
> Maybe a history of the development of Automata on Lucene. Were they built
> for in-memory usage only? Is there a good way to go about serializing it?
> If possible, I'd like some explanation on the mad pointers structure used
> to efficiently implement automata. From the videos I watched I was
> expecting a byte[] implementation, but looking at the code I see a couple
> of int[] used to represent states and transitions. What happened to the
> byte[] implementation of Lucene 4?
> --
> Juarez
