lucene-solr-user mailing list archives

From Andrew Wagner
Subject Deciding whether to stem at query time
Date Mon, 23 Apr 2012 19:21:09 GMT
So I just realized the other day that stemming basically happens at index
time. If I'm understanding correctly, there's no way to allow a user to
specify, at run time, whether to stem particular words or not based on a
single index. I think there are two options, but I'd love to hear that I'm wrong:

1.) Incrementally build up a white list of words that don't stem very well.
To pick a random example out of the blue, "light" isn't super closely
related to "lighter", so I might choose not to stem that. If I wanted to
do this, I think (if I understand correctly), StemmerOverrideFilter would
help me out with this. I'm not a big fan of this approach.

2.) Index all the text in two fields, once with stemming and once without.
Then build some kind of option into the UI for specifying whether to stem
the words or not, and search the appropriate field. Unfortunately, this
would roughly double the size of my index, and probably affect query times
too. Plus, the UI would probably suck.
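
To make option 1 concrete, here is a rough Python sketch of the override idea: look the word up in a hand-maintained table first, and only fall back to the stemmer when there is no entry. This is not Solr code. The toy_stem function is a crude stand-in for a real stemmer, and OVERRIDES and stem_with_overrides are made-up names for illustration.

```python
# Toy illustration only: a crude suffix stripper stands in for a real stemmer.
def toy_stem(word: str) -> str:
    for suffix in ("ers", "er", "ing", "s"):
        # Only strip when at least three characters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical override table: word -> the stem to use instead.
# Mapping a word to itself effectively protects it from stemming.
OVERRIDES = {"lighter": "lighter"}


def stem_with_overrides(word: str, overrides=OVERRIDES) -> str:
    key = word.lower()
    if key in overrides:
        # The override wins; the stemmer never sees this word.
        return overrides[key]
    return toy_stem(key)
```

With this table, "lighter" stays "lighter" while other words still go through the stemmer. The table has to be grown by hand as badly stemmed words turn up, which is the maintenance cost mentioned in option 1.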

Am I missing an option? Has anyone tried one of these approaches?
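
For option 2, the query side might look roughly like the Python sketch below, assuming the same text has been indexed into two fields. The field names text_stemmed and text_exact are hypothetical, as are the helper names, and the sketch does no escaping of query syntax characters. Each term carries a flag saying whether the user wants it stemmed, and that flag picks the field.

```python
from urllib.parse import urlencode

# Hypothetical field names; the real schema would define these.
STEMMED_FIELD = "text_stemmed"
EXACT_FIELD = "text_exact"


def build_q(terms):
    """Build a query string from (word, stem) pairs.

    stem=True searches the stemmed field, stem=False the unstemmed one.
    """
    clauses = []
    for word, stem in terms:
        field = STEMMED_FIELD if stem else EXACT_FIELD
        clauses.append(f"{field}:{word}")
    return " ".join(clauses)


def build_params(terms):
    # URL-encode the query as request parameters for a select handler.
    return urlencode({"q": build_q(terms), "wt": "json"})
```

Because the choice is made per term, a single query can mix stemmed and unstemmed words. How the UI collects that flag is left open here.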

