lucene-solr-user mailing list archives

From Mark Fenbers <mark.fenb...@noaa.gov>
Subject logical steps to configuring file-based spell-check
Date Sun, 01 Nov 2015 16:03:25 GMT
Greetings!

I want my spell-checker to be based on a file 
(/usr/share/dict/linux.words should suffice).  Word-break handling 
would also be a benefit.  I have previously indexed my docs for 
searching with minimal alterations to the baseline Solr configuration.  
My "docs" are user-typed text, typically a paragraph or two.  The Solr 
searching feature works very well with my local customization.  With the 
success of using the search feature, I now move on to adding 
spell-checking capabilities to my project.

Though my archive of docs *does* contain many technical terms and coded 
site identifiers, I prefer not to use the index-based spellcheck at this 
time, because the archive has never been spell-checked and I'm 
apprehensive that misspelled words will appear in my suggestions.  
But the index-based spell-checker is the baseline configuration, so I 
need to change that to use file-based spell checking.  Intuitively, this 
seems as simple as commenting out the IndexBasedSpellChecker XML section 
and uncommenting the FileBasedSpellChecker XML section in the 
solrconfig.xml file that I've customized.  But in doing that, I have 
gotten quite bizarre results, and though I've had much help from some 
very smart (and patient) contributors on this forum, I still have never 
gotten spell-checking to work in any meaningful way, even using the 
debugger.
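
For reference, the kind of section I mean looks roughly like the sketch 
below.  This is only an illustration of the shape, not my exact config: 
the field type "text_general", the field name "text", the dictionary 
names, and the index directory are assumptions based on the stock sample 
config, and the word-break checker is included only because I'd like 
that feature too.

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Assumed field type used to analyze the spellcheck query -->
  <str name="queryAnalyzerFieldType">text_general</str>

  <!-- File-based dictionary: one word per line in the source file -->
  <lst name="spellchecker">
    <str name="name">file</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="sourceLocation">/usr/share/dict/linux.words</str>
    <str name="characterEncoding">UTF-8</str>
    <!-- Assumed location for the spellcheck index built from the file -->
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>

  <!-- Word-break suggestions; this checker reads terms from an indexed
       field, and "text" is an assumed field name -->
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">text</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>
```

As I understand it, a request would then select the dictionary with 
spellcheck.dictionary=file, and spellcheck.build=true would be needed 
once to build the dictionary from the file.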

So, my questions for now are:

Should setting up a file-based spell checker be just a matter of 
starting with the baseline solrconfig.xml, commenting out the 
index-based spell checker, and uncommenting the file-based spell checker 
(and changing the sourceLocation value), or am I overlooking something?  
My second question is: which "baseline" solrconfig.xml should I use as a 
starting point?  There are several solrconfig.xml files nested in the 
subfolders when I unpack the tarball.  I'm using 5.3.0 in case that 
matters.

Thanks!
Mark


