lucene-solr-user mailing list archives

From Gora Mohanty <>
Subject Data storage, and textual analysis
Date Tue, 19 Jan 2010 18:41:05 GMT

Another simple query. I have set up a field to hold phonetic
equivalents, with the relevant part of schema.xml looking like:
 <tokenizer class="solr.WhitespaceTokenizerFactory"/>
 <filter class="solr.WordDelimiterFilterFactory"
         generateWordParts="1" generateNumberParts="0" catenateWords="1"
         catenateNumbers="0" catenateAll="0"/>
 <filter class="solr.LowerCaseFilterFactory"/>
 <filter

Here, is
a custom filter that provides a phonetic soundslike equivalent for
Indian languages transliterated into English. However, that is
irrelevant here, as the issue below holds even if I use the standard

I have a data source where all text is upper-case, and from
various Solr-related discussions found through Google, I would have
thought that fields of this type would be stored as the lower-case,
soundslike equivalent. Instead the data (as seen through the Solr
admin. interface, or through a front-end search) seem to be stored
as is.

The Solr admin. analysis view does show the index and query
conversions as I would expect. Also, phonetic matches, and matches
with lower-case input work properly. I am just curious as to how
this works.
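
A toy sketch of one model consistent with the observations above. It
assumes (this message does not confirm it) that the stored value is
kept verbatim, and that the analysis chain is applied only to the
terms that go into the index and to the incoming query. The names
analyze and ToyIndex are hypothetical and are not Solr APIs; the
analysis here is only a whitespace split plus lower-casing, with no
phonetic step.

```python
def analyze(text):
    """Toy stand-in for an analysis chain: whitespace split, then lower-case."""
    return [token.lower() for token in text.split()]


class ToyIndex:
    """Hypothetical index that keeps stored values and indexed terms apart."""

    def __init__(self):
        self.stored = {}    # doc id -> original text, kept verbatim
        self.postings = {}  # analyzed term -> set of doc ids

    def add(self, doc_id, text):
        # The stored copy is untouched; only the indexed terms are analyzed.
        self.stored[doc_id] = text
        for term in analyze(text):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        # The query goes through the same analysis as the indexed text.
        hits = set()
        for term in analyze(query):
            hits |= self.postings.get(term, set())
        # Results are built from the stored (original) values.
        return [self.stored[doc_id] for doc_id in sorted(hits)]


index = ToyIndex()
index.add(1, "NEW DELHI")
print(index.search("delhi"))  # prints ['NEW DELHI']
```

In this model a lower-case query matches upper-case source data, yet
the value returned for display is still the original upper-case text.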

