lucene-solr-user mailing list archives

From Erick Erickson
Subject Re: Applying Tokenizers and Filters to CopyFields
Date Wed, 25 Mar 2015 22:43:10 GMT
Perhaps this would help:

indexed=true, stored=true
The field can be searched. The raw input (not analyzed in any way) can be
shown to the user in the results list.

indexed=true, stored=false
The field can be searched. However, the field can't be returned in the
results list with the document.

indexed=false, stored=true
The field cannot be searched, but the contents can be returned in the
results list with the document. There are some use-cases where this is
desirable behavior.

indexed=false, stored=false
The entire field is thrown out; it's just as if you didn't send the
field to be indexed at all.
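
In schema.xml terms, the four combinations above might look roughly like
this. It's only a sketch: the field names are made up, and I'm assuming
"text_general" and "string" are defined as fieldTypes (as they are in the
stock example schema).

```xml
<!-- Hypothetical field names; "text_general" and "string" are assumed fieldTypes. -->

<!-- Searchable, and the raw input comes back in the results list. -->
<field name="title" type="text_general" indexed="true" stored="true"/>

<!-- Searchable, but cannot be returned with the document. -->
<field name="body_search" type="text_general" indexed="true" stored="false"/>

<!-- Returned with the document, but cannot be searched. -->
<field name="thumbnail_url" type="string" indexed="false" stored="true"/>

<!-- Neither: the value is discarded. -->
<field name="scratch" type="string" indexed="false" stored="false"/>
```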

And one other thing: the copyField gets the _raw_ data, not the
analyzed data. Let's say you have two fields, "src" and "dst".
Copying from src to dst in schema.xml is identical to sending
    <field name="src">original text</field>
    <field name="dst">original text</field>

That is, each field analyzes the same raw input on its own, and
copyField directives are not chained.
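
Here is a sketch of how that plays out for a case-folding target field.
The names "text_lower" and "other" are hypothetical, and I'm assuming a
"string" fieldType exists; the tokenizer and filter factories are the
standard Solr ones.

```xml
<!-- Hypothetical fieldType: tokenize, then lowercase every token. -->
<fieldType name="text_lower" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="src" type="string" indexed="true" stored="true"/>
<field name="dst" type="text_lower" indexed="true" stored="false"/>
<field name="other" type="text_lower" indexed="true" stored="false"/>

<!-- dst receives the raw value sent to src and runs its own analysis on it. -->
<copyField source="src" dest="dst"/>

<!-- Not chained: this copies only what was sent to dst directly,
     not what dst received from src. -->
<copyField source="dst" dest="other"/>
```

With a setup like this, the lower-case tokens live in the inverted index
for dst. Queries against dst go through the same analyzer, so they match
regardless of case, while the stored value of src stays verbatim.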

Also, watch out for your query syntax. Michael's comments are spot-on,
I'd just add this:


is kind of odd. Let's assume you mean "qf" rather than "fq". That
_only_ matters if your query parser is "edismax"; I believe it'll be
ignored in this case.

You'd want something like
or even

where "df" is "default field" and the search is applied against that
field in the absence of a field qualification like my first two
examples.

On Wed, Mar 25, 2015 at 2:52 PM, Michael Della Bitta wrote:
> I agree the terminology is possibly a little confusing.
> Stored refers to values that are stored verbatim. You can retrieve them
> verbatim. Analysis does not affect stored values.
> Indexed values are tokenized/transformed and stored inverted. You can't
> recover the literal analyzed version (at least, not easily).
> If what you really want is to store and retrieve case folded versions of
> your data as well as the original, you need to use something like an
> UpdateRequestProcessor, which I personally am less familiar with.
> On Wed, Mar 25, 2015 at 5:28 PM, Martin Wunderlich wrote:
>> So, the pre-processing steps are applied under <analyzer type="index">.
>> And this point is not quite clear to me: Assuming that I have a simple
>> case-folding step applied to the target of the copyField: How or where are
>> the lower-case tokens stored, if the text isn’t added to the index? How is
>> the query supposed to retrieve the lower-case version?
>> (sorry, if this sounds like a naive question, but I have a feeling that I
>> am missing something really basic here).
