lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Autocomplete and Sorting on multiple multi-value/single-value fields
Date Sun, 22 Aug 2010 22:11:48 GMT
Hmmm, is it then really acceptable for the document display to look like
title     crew      cast
zzzz
gggg      cvalue
mmmmm               cast value

?

Presuming that each document has a title, I know of no built-in way to say
"only sort on the title if there was a title match" or, alternatively,
"only sort on fields that match and intermingle all the results".

The multi-valued thing is, I think, a red herring. I don't think you could
do what you want even if there were only a single value for each record
for cast and crew.
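
To pin down the ordering being asked for, here is a minimal client-side
sketch in Python. It does not make Solr do the sort; it only shows
"sort each hit by whichever value matched". The helper names, the choice
to use the smallest matching value when several match, and the sample
documents are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple, Union

Doc = Dict[str, Union[str, List[str]]]


def matching_values(doc: Doc, prefix: str,
                    fields: Tuple[str, ...] = ("title", "cast", "crew")) -> List[str]:
    """Collect every value in `fields` that starts with `prefix`.

    Values are trimmed and lowercased before the prefix test, which
    approximates a keyword-tokenized, lowercased edge n-gram match.
    """
    wanted = prefix.strip().lower()
    hits: List[str] = []
    for field in fields:
        raw = doc.get(field, [])
        # Single-valued and multi-valued fields are treated the same way.
        values = [raw] if isinstance(raw, str) else raw
        hits.extend(v for v in values if v.strip().lower().startswith(wanted))
    return hits


def match_sort_key(doc: Doc, prefix: str) -> Tuple[bool, str]:
    """Sort key: documents with no match go last; the rest sort by their
    smallest matching value, compared case-sensitively with punctuation."""
    hits = matching_values(doc, prefix)
    return (not hits, min(hits) if hits else "")


# Made-up sample documents (assumption), one match per field type.
docs = [
    {"title": "zzzz", "cast": ["Dreyfus"], "crew": []},
    {"title": "gggg", "cast": [], "crew": ["Dram"]},
    {"title": "Dr. Doodle", "cast": ["Someone Else"], "crew": []},
]

ordered = sorted(docs, key=lambda d: match_sort_key(d, "dr"))
for d in ordered:
    print(d["title"], matching_values(d, "dr"))
```

Note that the titles come out unordered (Dr. Doodle, gggg, zzzz), which is
exactly the display question above, and that this only works on a result
set already loaded into the client, so paging would have to happen there
too.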

Of course I've been wrong many times before....

Best
Erick

On Sun, Aug 22, 2010 at 2:41 PM, Neil Lott <neilmatthewlott@yahoo.com> wrote:

> Hi Erick,
>
> I think this query explains what I'm trying to do to an extent minus the
> sorting:
>
>       http://localhost:8983/solr/core/select/?q=(titleac:dr OR castac:dr OR crewac:dr)&version=2.2&start=0&rows=100&indent=on&fl=title,cast,crew
>
>
> If I get a match in the title field, the cast field, or the crew field, I
> want to return the results.  Since the match could be in any of those
> fields, let's say I match:
>
> match1:  title:  Dr. Doodle
> match2:  cast: Dreyfus  (no other title or crew match)
> match3: crew: Dram    (no other title or cast match)
>
> I'd like solr to sort my results to look like this as well:
>
> match1:  title:  Dr. Doodle
> match3: crew: Dram
> match2:  cast: Dreyfus
>
> The fields I'm searching on are autocomplete fields, so I cannot sort by
> them.  That's why I have a copy field with the alphaOnlySort field type,
> which allows me to sort on the original field value.
>
> The problem is that crew and cast are multi-valued fields and to my
> understanding there is no way to sort on multivalued fields.
>
> Does that help clarify my problem?  I'm sure other people have run into
> this and am curious what their approach was.
>
> Thanks,
>
> Neil
>
> On Aug 22, 2010, at 2:36 PM, Erick Erickson wrote:
>
> > Could you fill us in a little more on the behavior you're after? Because
> > I'm having trouble understanding what "sort across title and multi-valued
> > fields" means...
> >
> > If every document has a title, and title is unique, then there's no need
> > to sort by anything else. Sub-sorts only make sense if you have duplicate
> > titles. Which may be the case in your application, of course.....
> >
> > The fact that the query matches in a field that isn't the sort field is
> > irrelevant, as long as the document matched (in whatever field) has a
> > title......
> >
> > Best
> > Erick
> >
> > On Sat, Aug 21, 2010 at 7:27 PM, Neil Lott <neilmatthewlott@yahoo.com> wrote:
> >
> >> Hi,
> >>
> >> I'm wondering if anyone has run across this issue before.  I do
> >> understand that you cannot sort on a multivalued field -- so I'm looking
> >> for alternatives people have used.
> >>
> >> Let's say I have nine fields:
> >>
> >>       <field name="title" type="text" indexed="true" stored="true" required="true"/>
> >>       <field name="titleac" type="autocomplete" indexed="true" stored="true" omitNorms="true" omitTermFreqAndPositions="true"/>
> >>       <field name="titlesort" type="alphaOnlySort" indexed="true" stored="true"/>
> >>
> >>       <field name="cast" type="text" indexed="true" stored="true" required="true" multiValued="true"/>
> >>       <field name="castac" type="autocomplete" indexed="true" stored="true" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true"/>
> >>
> >>       <field name="crew" type="text" indexed="true" stored="true" required="true" multiValued="true"/>
> >>       <field name="crewac" type="autocomplete" indexed="true" stored="true" omitNorms="true" omitTermFreqAndPositions="true" multiValued="true"/>
> >>
> >> The text field type is standard:
> >>
> >>       <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >>           <analyzer type="index">
> >>               <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>               <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
> >>               <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >>               <filter class="solr.LowerCaseFilterFactory"/>
> >>               <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
> >>               <filter class="solr.PorterStemFilterFactory"/>
> >>           </analyzer>
> >>           <analyzer type="query">
> >>               <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>               <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> >>               <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
> >>               <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >>               <filter class="solr.LowerCaseFilterFactory"/>
> >>               <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
> >>               <filter class="solr.PorterStemFilterFactory"/>
> >>           </analyzer>
> >>       </fieldType>
> >>
> >> The autocomplete field type is pretty standard as well:
> >>
> >>       <fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
> >>           <analyzer type="index">
> >>               <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>               <filter class="solr.LowerCaseFilterFactory"/>
> >>               <filter class="solr.TrimFilterFactory"/>
> >>               <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="100"/>
> >>           </analyzer>
> >>           <analyzer type="query">
> >>               <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>               <filter class="solr.LowerCaseFilterFactory"/>
> >>               <filter class="solr.TrimFilterFactory"/>
> >>           </analyzer>
> >>       </fieldType>
> >>
> >> I need the sort to be case sensitive and to include punctuation etc., so
> >> that field type looks like this:
> >>
> >>       <fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
> >>           <analyzer>
> >>               <tokenizer class="solr.KeywordTokenizerFactory"/>
> >>               <filter class="solr.TrimFilterFactory"/>
> >>           </analyzer>
> >>       </fieldType>
> >>
> >> So if I do this:
> >>
> >>
> >>
> >>       http://localhost:8983/solr/core/select/?q=titleac:dr&version=2.2&start=0&rows=100&indent=on&fl=title&sort=titlesort asc
> >>
> >> Everything works and I get a set of autocompleted results starting with
> >> "dr" in all forms sorted.  Exactly what I want.
> >>
> >> The problem is that I also need to do this:
> >>
> >>       http://localhost:8983/solr/core/select/?q=(titleac:dr OR castac:dr)&version=2.2&start=0&rows=100&indent=on&fl=title,cast
> >>
> >> (and the results need to be sorted across the title field and any match
> >> in the multivalued cast field)
> >>
> >> And I also need to do this:
> >>
> >>       http://localhost:8983/solr/core/select/?q=(titleac:dr OR castac:dr OR crewac:dr)&version=2.2&start=0&rows=100&indent=on&fl=title,cast,crew
> >>
> >> (and the results need to be sorted across the title field, any match in
> >> the multivalued cast field, and any match in the multivalued crew field)
> >>
> >> As you can see I'm trying to autocomplete across multiple fields, some of
> >> which are multi-valued, and then sort those results in solr so solr does
> >> all my paging work.
> >>
> >> This way I don't have to load the full result sets into my jvm client and
> >> then manually sort them each time.
> >>
> >> You can also see I'm trying to make it into one query as my assumption is
> >> that this will take the least amount of time.
> >>
> >> Would anyone happen to have suggestions on how I'm approaching this
> >> problem?
> >>
> >> Thanks,
> >>
> >> Neil
>
>
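
For reference, the select URLs in the quoted messages can be assembled
programmatically so the spaces and the sort clause are encoded properly.
This is only a rough Python sketch: `autocomplete_url` is a hypothetical
helper, the host, core and parameters are copied from the thread, and it
assumes the prefix contains no query-syntax characters. The operator is
written as uppercase OR because the standard query parser treats a
lowercase "or" as an ordinary search term.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def autocomplete_url(prefix, search_fields, return_fields, sort=None,
                     base="http://localhost:8983/solr/core/select/",
                     rows=100):
    """Build a select URL that ORs one prefix across several fields.

    Hypothetical helper; assumes `prefix` needs no query escaping.
    """
    clauses = " OR ".join(f"{field}:{prefix}" for field in search_fields)
    params = {
        "q": f"({clauses})",
        "version": "2.2",
        "start": 0,
        "rows": rows,
        "indent": "on",
        "fl": ",".join(return_fields),
    }
    if sort:
        # e.g. "titlesort asc"; the space is encoded by urlencode
        params["sort"] = sort
    return base + "?" + urlencode(params)


url = autocomplete_url("dr",
                       ["titleac", "castac", "crewac"],
                       ["title", "cast", "crew"])
sorted_url = autocomplete_url("dr", ["titleac"], ["title"],
                              sort="titlesort asc")

print(url)
# Decode the query string again to check what Solr would receive.
print(parse_qs(urlsplit(url).query)["q"][0])
print(parse_qs(urlsplit(sorted_url).query)["sort"][0])
```

This only builds the request; it does nothing about the sorting question
itself.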
