lucene-java-user mailing list archives

From Shay Hummel
Subject Text dependent analyzer
Date Tue, 14 Apr 2015 17:12:21 GMT
I would like to create a text dependent analyzer.
That is, *given a string*, the analyzer will:
1. Read the entire text and break it into sentences.
2. Tokenize each sentence, then remove possessives, lowercase, mark terms,
and stem.

The second part is essentially what EnglishAnalyzer already does (in
createComponents). However, that chain does not depend on the text it
receives, whereas the first part, sentence splitting, needs to see the
entire text.
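
To make the two steps concrete, here is a minimal sketch in plain Java. It
is not Lucene code: the class name SentenceSketch and both helper methods
are hypothetical, sentence detection is assumed to be good enough with the
JDK's java.text.BreakIterator, and stemming and term marking are left out
because they would stay with the filters that EnglishAnalyzer already chains
together.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; none of these names come from Lucene.
public class SentenceSketch {

    /** Step 1: read the entire text and break it into sentences. */
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    /**
     * Part of step 2: naive tokenization, possessive removal and lowercasing.
     * Stemming and term marking are intentionally omitted from this sketch.
     */
    static List<String> normalize(String sentence) {
        List<String> tokens = new ArrayList<>();
        // Split on anything that is not a letter, digit or apostrophe.
        for (String raw : sentence.split("[^\\p{L}\\p{N}'\\u2019]+")) {
            // Drop a trailing 's (straight or curly apostrophe), then lowercase.
            String token = raw.replaceAll("['\\u2019][sS]$", "").toLowerCase(Locale.ROOT);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Shay's analyzer reads the text. Then it splits it.";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence + " -> " + normalize(sentence));
        }
    }
}
```

The open question is where step 1 belongs in Lucene, since splitSentences
needs the whole input before any token is produced, while the filter chain
in step 2 works token by token.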

So, how can this be achieved?

Thank you,

Shay Hummel
