lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Autocomplete and Sorting on multiple multi-value/single-value fields
Date Sun, 22 Aug 2010 20:36:02 GMT
Could you fill us in a little more on the behavior you're after? Because I'm
having trouble understanding what "sort across title and multi-valued fields"
means...

If every document has a title, and title is unique, then there's no need to
sort by anything else. Sub-sorts only make sense if you have duplicate titles,
which may be the case in your application, of course.
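
To illustrate the point about sub-sorts, here is a minimal Java sketch. It is
not taken from Solr; the class name, the year tie-breaker and the sample rows
are all made up for illustration, since the thread only mentions titles. It
shows that a secondary key is consulted only when two titles compare equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; each document is a {title, year} pair.
class SubSortDemo {

    // Sorts by title first; the year is only consulted when two titles are
    // equal. Years are compared as strings, which is fine for 4-digit values.
    static List<String[]> sortByTitleThenYear(List<String[]> docs) {
        List<String[]> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparing((String[] d) -> d[0])
                              .thenComparing(d -> d[1]));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sample data with one duplicated title.
        List<String[]> docs = Arrays.asList(
            new String[] {"Dracula", "1958"},
            new String[] {"Dr. No", "1962"},
            new String[] {"Dracula", "1931"});
        for (String[] d : sortByTitleThenYear(docs)) {
            System.out.println(d[0] + " (" + d[1] + ")");
        }
    }
}
```

With unique titles the second comparator never runs, which is why a sub-sort
adds nothing in that case.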

The fact that the query matches in a field that isn't the sort field is
irrelevant, as long as the document matched (in whatever field) has a title.
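
As a rough sketch of what that could look like from a JVM client, the helper
below builds one request that matches on any of the autocomplete fields but
always sorts on the single-valued titlesort field. The class and method names
are hypothetical; only the field names and the select URL come from the
original message. It assumes the prefix is a single token with nothing that
needs query-syntax escaping, and it writes the operator as uppercase OR
because, as far as I know, the standard query parser treats a lowercase "or"
as an ordinary term.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; only the field names and the select URL come from the thread.
class AutocompleteQuery {

    // Base URL copied from the example requests quoted below.
    static final String BASE = "http://localhost:8983/solr/core/select/";

    // Builds e.g. (titleac:dr OR castac:dr). Assumes the prefix is a single
    // token with no whitespace or query-syntax characters that need escaping.
    static String buildQ(String prefix, String... fields) {
        StringBuilder q = new StringBuilder("(");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) q.append(" OR ");
            q.append(fields[i]).append(':').append(prefix);
        }
        return q.append(')').toString();
    }

    // The sort is always on the single-valued titlesort field, regardless of
    // which field matched; start and rows let Solr do the paging.
    static String buildUrl(String prefix, int start, int rows, String... fields) {
        return BASE
            + "?q=" + encode(buildQ(prefix, fields))
            + "&start=" + start
            + "&rows=" + rows
            + "&fl=" + encode("title,cast,crew")
            + "&sort=" + encode("titlesort asc");
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("dr", 0, 100, "titleac", "castac", "crewac"));
    }
}
```

Paging is then just a matter of calling buildUrl again with a larger start
value; the client never has to hold or re-sort the full result set.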

Best
Erick

On Sat, Aug 21, 2010 at 7:27 PM, Neil Lott <neilmatthewlott@yahoo.com> wrote:

> Hi,
>
> I'm wondering if anyone has run across this issue before. I do understand
> that you cannot sort on a multivalued field, so I'm looking for
> alternatives people have used.
>
> Let's say I have these seven fields:
>
>        <field name="title" type="text" indexed="true" stored="true"
> required="true"/>
>        <field name="titleac" type="autocomplete" indexed="true"
> stored="true" omitNorms="true" omitTermFreqAndPositions="true"/>
>        <field name="titlesort" type="alphaOnlySort" indexed="true"
> stored="true"/>
>
>        <field name="cast" type="text" indexed="true" stored="true"
> required="true" multiValued="true"/>
>        <field name="castac" type="autocomplete" indexed="true"
> stored="true" omitNorms="true" omitTermFreqAndPositions="true"
> multiValued="true"/>
>
>        <field name="crew" type="text" indexed="true" stored="true"
> required="true" multiValued="true"/>
>        <field name="crewac" type="autocomplete" indexed="true"
> stored="true" omitNorms="true" omitTermFreqAndPositions="true"
> multiValued="true"/>
>
> The text field type is standard:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>            <analyzer type="index">
>                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>                <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true"/>
>                <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>                <filter class="solr.LowerCaseFilterFactory"/>
>                <filter class="solr.KeywordMarkerFilterFactory"
> protected="protwords.txt"/>
>                <filter class="solr.PorterStemFilterFactory"/>
>            </analyzer>
>            <analyzer type="query">
>                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>                <filter class="solr.SynonymFilterFactory"
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>                <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt" enablePositionIncrements="true"/>
>                <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>                <filter class="solr.LowerCaseFilterFactory"/>
>                <filter class="solr.KeywordMarkerFilterFactory"
> protected="protwords.txt"/>
>                <filter class="solr.PorterStemFilterFactory"/>
>            </analyzer>
>        </fieldType>
>
> The autocomplete field type is pretty standard as well:
>
>  <fieldType name="autocomplete" class="solr.TextField"
> positionIncrementGap="100">
>            <analyzer type="index">
>                <tokenizer class="solr.KeywordTokenizerFactory"/>
>                <filter class="solr.LowerCaseFilterFactory"/>
>                <filter class="solr.TrimFilterFactory"/>
>                <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
> maxGramSize="100"/>
>            </analyzer>
>            <analyzer type="query">
>                <tokenizer class="solr.KeywordTokenizerFactory"/>
>                <filter class="solr.LowerCaseFilterFactory"/>
>                <filter class="solr.TrimFilterFactory"/>
>            </analyzer>
>        </fieldType>
>
> I need the sort to be case-sensitive and to take punctuation into account,
> so that field type looks like this:
>
>        <fieldType name="alphaOnlySort" class="solr.TextField"
> sortMissingLast="true" omitNorms="true">
>            <analyzer>
>                <tokenizer class="solr.KeywordTokenizerFactory"/>
>                <filter class="solr.TrimFilterFactory"/>
>            </analyzer>
>        </fieldType>
>
> So if I do this:
>
>
> http://localhost:8983/solr/core/select/?q=titleac:dr&version=2.2&start=0&rows=100&indent=on&fl=title&sort=titlesort asc
>
> Everything works, and I get a sorted set of autocompleted results starting
> with "dr" in all its forms. Exactly what I want.
>
> The problem is that I also need to do this:
>
>        http://localhost:8983/solr/core/select/?q=(titleac:dr OR
> castac:dr)&version=2.2&start=0&rows=100&indent=on&fl=title,cast
>
> (and the results need to be sorted across both the title field and any
> match in the multivalued cast field)
>
> And I also need to do this:
>
>        http://localhost:8983/solr/core/select/?q=(titleac:dr OR castac:dr
> OR crewac:dr)&version=2.2&start=0&rows=100&indent=on&fl=title,cast,crew
>
> (and the results need to be sorted across the title field, any match in
> the multivalued cast field, and any match in the multivalued crew field)
>
> As you can see, I'm trying to autocomplete across multiple fields, some of
> which are multi-valued, and then sort those results in Solr so that Solr
> does all my paging work.
>
> This way I don't have to load the full result sets into my JVM client and
> then manually sort them each time.
>
> You can also see I'm trying to make it into one query, as my assumption is
> that this will take the least amount of time.
>
> Would anyone happen to have suggestions on how I'm approaching this
> problem?
>
> Thanks,
>
> Neil