lucene-solr-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Data storage, and textual analysis
Date Tue, 19 Jan 2010 20:02:27 GMT
Gora,

What you are seeing are the *stored* values, which are the original, unchanged field values.
Analysis is applied to text only for *indexing* purposes (and to the query text at search time);
it does not change what is stored and returned.
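
To make the stored/indexed split concrete, here is a minimal sketch in Python. It is a toy model, not how Solr or Lucene is implemented: `ToyIndex` and `analyze` are hypothetical names, and the "sounds-like" key is a crude stand-in for a real phonetic filter such as Double Metaphone.

```python
import re


def analyze(text):
    """Toy analysis chain: tokenize, lower-case, reduce to a crude sounds-like key.

    Assumption: this only imitates the idea of a phonetic filter; it is
    not Double Metaphone or any real Solr filter.
    """
    keys = []
    for token in re.findall(r"[A-Za-z]+", text):
        token = token.lower()
        # Keep the first letter, drop later vowels, collapse repeated letters.
        key = token[0] + re.sub(r"[aeiou]", "", token[1:])
        key = re.sub(r"(.)\1+", r"\1", key)
        keys.append(key)
    return keys


class ToyIndex:
    def __init__(self):
        self.stored = {}    # doc id -> original value, never analyzed
        self.postings = {}  # analyzed term -> set of doc ids

    def add(self, doc_id, value):
        # The stored copy is kept verbatim; only the index sees analyzed terms.
        self.stored[doc_id] = value
        for term in analyze(value):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, query):
        # The query goes through the same analysis, so it meets the indexed terms.
        terms = analyze(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        # Results are built from the stored values, hence the original upper-case text.
        return [self.stored[doc_id] for doc_id in sorted(hits)]
```

With this sketch, adding a document whose value is "GORA MOHANTY" and then searching for "gora" or "Gorra" both return the original upper-case string: matching happens on the analyzed terms, while the displayed value comes from the untouched stored copy.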


Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: Gora Mohanty <gora@srijan.in>
> To: solr-user@lucene.apache.org
> Sent: Tue, January 19, 2010 1:41:05 PM
> Subject: Data storage, and textual analysis
> 
> Hi,
> 
> Another simple query. I have set up a field to hold phonetic
> equivalents, with the relevant part of schema.xml looking like:
> 
> <filter ... generateWordParts="1" generateNumberParts="0" catenateWords="1"
>         catenateNumbers="0" catenateAll="0"/>
> <filter class="com.srijan.search.solr.analysis.AspellFilterFactory"/>
> 
> Here, com.srijan.search.solr.analysis.AspellFilterFactory is
> a custom filter that provides a phonetic soundslike equivalent for
> Indian languages transliterated into English. However, that is
> irrelevant here, as the issue below holds even if I use the standard
> solr.DoubleMetaphoneFilterFactory.
> 
> I have a data source where all text is upper-case, and from
> various Solr-related discussions found through Google, I would have
> thought that fields of this type would be stored as the lower-case,
> soundslike equivalent. Instead, the data (as seen through the Solr
> admin interface, or through a front-end search) seems to be stored
> as is.
> 
> The Solr admin analysis view does show the index and query
> conversions as I would expect. Also, phonetic matches and matches
> with lower-case input work properly. I am just curious as to how
> this works.
> 
> Regards,
> Gora

