lucene-dev mailing list archives

From "Raghavendra Prabhu" <>
Subject Re: Contextual suggestions
Date Mon, 03 Apr 2006 05:38:07 GMT
You were right about it not working on a big corpus.

I think there is a limit on the query size, and a big corpus would exceed it.

I am looking at something similar myself, but I am still going through the basics.


On 4/3/06, karl wettin <> wrote:
> On 31 Mar 2006, at 06:54, karl wettin wrote:
> > I've been working a bit with the spell checker. It does a pretty
> > good job when it comes to finding a simple typo.
> > I was thinking it would be nice if I could turn "heros light and
> > magic" to "did you mean: heroes of might and magic?".
> >
> > My strategy is to combine Markov, A* and Levenshtein.
> > Any comments on this? Questions?
> Nothing? Not even a go-go-go? I would really like to discuss it with
> someone before I spend too much time on it. This is what it is: a
> simple Markov chain is similar to ngrams, but on a word level rather
> than character level. A* is a classic gaming algorithm to find the
> cheapest path in a matrix. I assume you all know Levenshtein from
> FuzzyQuery.
> I have slept on this a bit and think it might not work on a
> big corpus. One probably has to limit it to one Markov chain per
> context of some kind, say per category.
> Perhaps there is some other forum, more focused on text analysis,
> that you could recommend to me?
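To make the quoted idea concrete, here is a minimal sketch of how a word-level Markov chain, a cheapest-path search and Levenshtein distance might be combined. It is not karl's implementation and uses nothing from Lucene. The names train_bigrams, suggest and INSERT_COST, the "<s>" start marker and the two-phrase toy corpus are all assumptions made for illustration. The search is uniform-cost search, which is A* with a zero heuristic; a real A* would add an admissible estimate of the remaining cost.

```python
import heapq
from collections import defaultdict

INSERT_COST = 2  # assumed fixed price for adding a word the user left out


def levenshtein(a, b):
    """Classic edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def train_bigrams(phrases):
    """Word-level Markov chain: map each word to the set of words seen after it."""
    follows = defaultdict(set)
    for phrase in phrases:
        words = ["<s>"] + phrase.split()  # "<s>" is a hypothetical start marker
        for left, right in zip(words, words[1:]):
            follows[left].add(right)
    return follows


def suggest(query, follows):
    """Cheapest walk through the chain that accounts for every query word."""
    words = query.split()
    # Heap entries: (cost so far, query words consumed, last chain word, path).
    frontier = [(0, 0, "<s>", ())]
    seen = set()
    while frontier:
        cost, i, last, path = heapq.heappop(frontier)
        if i == len(words):
            return " ".join(path)
        if (i, last) in seen:
            continue
        seen.add((i, last))
        for nxt in follows.get(last, ()):
            # Either nxt stands in for the typed word at position i ...
            heapq.heappush(frontier, (cost + levenshtein(words[i], nxt),
                                      i + 1, nxt, path + (nxt,)))
            # ... or nxt is a word the user skipped altogether.
            heapq.heappush(frontier, (cost + INSERT_COST, i, nxt, path + (nxt,)))
    return None  # the chain has no walk long enough to cover the query


chain = train_bigrams(["heroes of might and magic", "light and magic"])
print(suggest("heros light and magic", chain))
# prints: heroes of might and magic
```

The search state is the pair (query position, last chain word), so the number of states grows with the number of distinct words in the chain. That is one way to read the concern about a big corpus, and keeping one chain per category would keep each graph small.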
