lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Indexing fieldvalues with dashes and spaces
Date Thu, 05 Aug 2010 01:26:56 GMT
I suspect you're running afoul of tokenizers and filters. The parts of
your schema that you published aren't the ones that really count.

What you probably need to look at is the FieldType definitions, i.e. what
analysis is done for, say, text_ws (see <FieldType... in your schema).
There you might find things like WordDelimiterFilter (with several
options), LowerCaseFilter, etc. Each of these changes what's placed in
your index. Here's a good place to start, although it's not exhaustive:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

The general idea is that the Tokenizers break up the incoming stream
into tokens according to various rules. The Filters then (potentially)
modify each token in various ways.
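
To make the tokenizer/filter idea concrete, here is a rough sketch in
Python. It is NOT Solr code: the function names are made up for
illustration, and the word-delimiter step is a drastic simplification of
what WordDelimiterFilter does with its options. It only shows how the
same input can end up as very different tokens depending on the chain.

```python
import re

def whitespace_tokenizer(text):
    """Split the input on runs of whitespace."""
    return text.split()

def keyword_tokenizer(text):
    """Emit the whole input as a single, untouched token."""
    return [text]

def lowercase_filter(tokens):
    """Lowercase every token."""
    return [t.lower() for t in tokens]

def word_delimiter_filter(tokens):
    """Simplified stand-in: split tokens on non-alphanumeric characters
    and drop the empty pieces (so a lone '&' disappears)."""
    out = []
    for t in tokens:
        out.extend(p for p in re.split(r"[^0-9A-Za-z]+", t) if p)
    return out

def analyze(text, tokenizer, filters=()):
    """Run the tokenizer, then each filter in order."""
    tokens = tokenizer(text)
    for f in filters:
        tokens = f(tokens)
    return tokens

# Tokenized and filtered: the value is broken into separate terms.
print(analyze("Strand & Zee", whitespace_tokenizer,
              [word_delimiter_filter, lowercase_filter]))
# Kept as one token: the value goes into the index unchanged.
print(analyze("Strand & Zee", keyword_tokenizer))
```

The first chain is the kind of thing you want for free-text search, the
second is closer to what you want for a value that should be treated as
one unit.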

Until you have a firm handle on this process, facets are probably a
distraction. You're better off looking at your index with the admin
pages and/or Luke and/or LukeRequestHandler.
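
If you want to poke at LukeRequestHandler from a script, something like
the sketch below builds the request URL. The base URL and the field name
"city" are assumptions on my part (adjust them to your install and
schema), and it assumes the handler is registered at /admin/luke. The
fl and numTerms parameters ask for the top indexed terms of one field.

```python
from urllib.parse import urlencode

def build_luke_url(solr_base, field, num_terms=20):
    """Build a LukeRequestHandler URL that lists the top indexed terms
    for one field. solr_base is an assumed base URL for the Solr webapp."""
    params = {"fl": field, "numTerms": num_terms, "wt": "json"}
    return solr_base.rstrip("/") + "/admin/luke?" + urlencode(params)

# Hypothetical base URL and field name, for illustration only.
print(build_luke_url("http://localhost:8983/solr", "city", 10))
```

Paste the printed URL into a browser and compare the terms you see with
the values you sent in.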

And do be aware that fields you get back from a request (i.e. a search)
are the stored fields, NOT what's indexed. This may trip you up too...
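
Here is a toy illustration of the stored vs. indexed difference, again
in Python and again not Solr code (TinyIndex and its methods are
invented for this example). The stored value is kept verbatim and is
what comes back in results, while matching and facet counts work off
the analyzed terms.

```python
from collections import defaultdict

class TinyIndex:
    """Toy index: keeps the original value (stored) separately from
    the analyzed terms (indexed)."""

    def __init__(self, analyzer):
        self.analyzer = analyzer
        self.stored = {}                  # doc id -> original value
        self.postings = defaultdict(set)  # term -> set of doc ids

    def add(self, doc_id, value):
        self.stored[doc_id] = value
        for term in self.analyzer(value):
            self.postings[term].add(doc_id)

    def search(self, term):
        """Match on an indexed term, return the stored values."""
        return [self.stored[d] for d in sorted(self.postings.get(term, ()))]

    def facet_counts(self):
        """Count documents per indexed term, not per stored value."""
        return {term: len(ids) for term, ids in self.postings.items()}

# Analyzer that lowercases and splits on whitespace.
idx = TinyIndex(lambda value: value.lower().split())
idx.add(1, "New York")
idx.add(2, "New Orleans")

print(idx.search("new"))       # stored values look fine
print(idx.search("New York"))  # but no such single term was indexed
print(idx.facet_counts())      # facets show the split terms
```

So a document can look perfectly correct in the search results and still
produce facet values that are chopped up at the spaces.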

HTH
Erick

On Wed, Aug 4, 2010 at 5:22 PM, PeterKerk <vetteparty@hotmail.com> wrote:

>
> Well the example you provided is 100% relevant to me :)
>
> I've read the wiki now (SchemaXml,SolrFacetingOverview,Query Syntax,
> SimpleFacetParameters), but still do not have an exact idea of what you
> mean.
>
> My situation:
> a city field is something that I want users to search on via text input, so
> let's say "New Yo" would give the results for "New York".
> But also a facet "Cities" is available in which "New York" is just one of
> the cities that is clickable.
>
> The other facet is "theme", which in my example holds values like
> "Gemeentehuis" and "Strand & Zee", that would not be a thing on which
> can be searched via manual input but IS clickable.
>
> If you look at my schema.xml, do you see stuff I'm doing that is absolutely
> wrong for the purpose described above? Because as far as I can see the
> documents are indexed correctly (BESIDES the spaces in the fieldvalues).
>
> Any help is greatly appreciated! :)
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Indexing-fieldvalues-with-dashes-and-spaces-tp1023699p1023992.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
