lucene-java-user mailing list archives

From Green Geometry <green.geome...@gmail.com>
Subject WordnetSynonymParser 4.4.0 which analyzer
Date Thu, 05 Sep 2013 20:07:45 GMT
Hello all,

I am trying to use WordnetSynonymParser in Lucene 4.4.0 to parse the
WordNet prolog file. I need to pass an analyzer to the WordnetSynonymParser
constructor. If I pass the StandardAnalyzer I get:

Exception in thread "main" java.text.ParseException: Invalid synonym rule at line 109
    at org.apache.lucene.analysis.synonym.WordnetSynonymParser.add(WordnetSynonymParser.java:75)
    at wordnetdirect404.WordNetDirect404.main(WordNetDirect404.java:43)
Caused by: java.lang.IllegalArgumentException: term: course of action analyzed to a token with posinc != 1
    at org.apache.lucene.analysis.synonym.SynonymMap$Builder.analyze(SynonymMap.java:125)
    at org.apache.lucene.analysis.synonym.WordnetSynonymParser.parseSynonym(WordnetSynonymParser.java:92)
    at org.apache.lucene.analysis.synonym.WordnetSynonymParser.add(WordnetSynonymParser.java:67)
    ... 1 more

I've googled a bit, and this may be related to the SynonymFilter
documentation, which says: "This token stream cannot properly handle
position increments != 1, ie, you should place this filter before filtering
out stop words."
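
To make the "posinc != 1" message above concrete, here is a minimal sketch of
what I think is happening. It is plain Java and does not use Lucene at all:
the class name PosIncSketch, the helper positionIncrements and the tiny
stop-word set are hypothetical, made up for illustration. The assumption is
that the analyzer drops "of" as a stop word, so the token after it gets a
position increment of 2 instead of 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PosIncSketch {
    // Hypothetical stop-word set, for illustration only (not Lucene's actual list).
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "of", "the"));

    // Simulates a whitespace tokenizer followed by a stop-word filter and
    // returns the position increment of every token that survives.
    // A removed stop word leaves a gap, so the next token's increment grows by one.
    static List<Integer> positionIncrements(String term, Set<String> stopWords) {
        List<Integer> increments = new ArrayList<Integer>();
        int skipped = 0;
        for (String word : term.trim().toLowerCase().split("\\s+")) {
            if (stopWords.contains(word)) {
                skipped++;          // token dropped, remember the gap
                continue;
            }
            increments.add(skipped + 1);
            skipped = 0;
        }
        return increments;
    }

    public static void main(String[] args) {
        // With stop-word removal: "of" is dropped, "action" gets increment 2.
        System.out.println(positionIncrements("course of action", STOP_WORDS));
        // Without stop-word removal: every token has increment 1.
        System.out.println(positionIncrements("course of action", new HashSet<String>()));
    }
}
```

If that assumption holds, any multi-word WordNet entry containing a stop word
would trigger the exception, and an analyzer chain without a stop filter
would not.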

If I build my own Analyzer, composed of ClassicTokenizer, ClassicFilter and
LowerCaseFilter, and pass it to the WordnetSynonymParser constructor, then
parsing works. However, the WordNet-enriched Lucene index that I build with
this WordNet-enabled Analyzer returns completely wrong results.

Which Analyzer should I pass to the WordnetSynonymParser constructor? And
does someone have a good example of WordnetSynonymParser usage, for
instance to build a Lucene index enriched with WordNet synonyms or a
WordNet-expanded query?

Regards,
Grelick
