lucene-java-user mailing list archives

From Michael McCandless <>
Subject German decompounding/tokenization with Lucene?
Date Fri, 15 Sep 2017 22:57:51 GMT

I need to index documents with German text in Lucene, and I'm wondering how
people have done this in the past?

Lucene already has a DictionaryCompoundWordTokenFilter ... is this what
people use?  Are there good, open-source friendly German dictionaries to
pair with it?
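For context, a minimal dependency-free sketch of the dictionary-based
approach that DictionaryCompoundWordTokenFilter takes: the original token
is kept, and every dictionary word found as a substring within configured
size bounds is emitted as an extra subtoken. The dictionary, size bounds,
and method names here are made up for illustration; the real filter
operates on a TokenStream with a CharArraySet dictionary.

```java
import java.util.*;

public class DecompoundSketch {
    // Simplified dictionary-based decompounding: keep the original token,
    // then add every dictionary word that appears as a substring whose
    // length falls within [minSubword, maxSubword].
    static List<String> decompound(String token, Set<String> dictionary,
                                   int minSubword, int maxSubword) {
        List<String> subtokens = new ArrayList<>();
        subtokens.add(token); // the Lucene filter also preserves the original token
        String lower = token.toLowerCase(Locale.GERMAN);
        for (int start = 0; start < lower.length(); start++) {
            int maxEnd = Math.min(lower.length(), start + maxSubword);
            for (int end = start + minSubword; end <= maxEnd; end++) {
                if (dictionary.contains(lower.substring(start, end))) {
                    subtokens.add(lower.substring(start, end));
                }
            }
        }
        return subtokens;
    }

    public static void main(String[] args) {
        // Toy German dictionary; a real one would come from a wordlist file.
        Set<String> dict = new HashSet<>(
                Arrays.asList("donau", "dampf", "schiff", "fahrt"));
        System.out.println(decompound("Donaudampfschiff", dict, 4, 15));
        // [Donaudampfschiff, donau, dampf, schiff]
    }
}
```

Note this brute-force substring matching will happily emit overlapping or
spurious subwords, which is one reason dictionary quality matters so much
for this approach.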


Mike McCandless
