lucene-solr-user mailing list archives

From "R. Tan" <tanrihae...@gmail.com>
Subject Re: Scoring for specific field queries
Date Fri, 09 Oct 2009 08:02:38 GMT
How do these filters help the autosuggest?
<filter class="solr.PatternReplaceFilterFactory" pattern="^(.{20})(.*)?"
replacement="$1" replace="all" />
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
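
For reference, a minimal Python sketch of what the two regex filters quoted in this thread would do to a query string. It assumes Python's `re` behaves like the Java regex engine Solr uses for these simple patterns. `normalize_query` is a hypothetical helper name, not a Solr API, and the tokenizer and RemoveDuplicatesTokenFilterFactory are not modeled here.

```python
import re

# Patterns copied from the filter definitions in this thread.
# Solr's "$1" backreference corresponds to r"\1" in Python.
STRIP_NON_ALNUM = re.compile(r"([^a-z0-9])")
KEEP_FIRST_20 = re.compile(r"^(.{20})(.*)?")


def normalize_query(text: str) -> str:
    """Approximate the lowercase + two PatternReplace steps (hypothetical helper)."""
    lowered = text.lower()                        # LowerCaseFilterFactory
    stripped = STRIP_NON_ALNUM.sub("", lowered)   # drop anything outside a-z and 0-9
    return KEEP_FIRST_20.sub(r"\1", stripped)     # keep only the first 20 characters


if __name__ == "__main__":
    for sample in ("Cafe Feliy", "The Champion!", "abcdefghijklmnopqrstuvwxyz"):
        print(repr(sample), "->", repr(normalize_query(sample)))
```

Inputs shorter than 20 characters pass through the second pattern unchanged, because `^(.{20})` cannot match them.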



On Fri, Oct 9, 2009 at 3:59 PM, Avlesh Singh <avlesh@gmail.com> wrote:

> >
> > What are the replacements for, the special character and 20 char?
> >
> I had no time to diff between your definitions and mine. Copy-pasting mine
> was easier :)
>
> Also, do you get results such as "XXXX formula"?
> >
> The "autocomplete" field would definitely not match this query, but the
> "tokenized autocomplete" would.
> Give it a shot, it should work as you expect it to.
>
> Cheers
> Avlesh
>
> On Fri, Oct 9, 2009 at 1:25 PM, R. Tan <tanrihaed58@gmail.com> wrote:
>
> > Thanks, I'll give this a go. What are the replacements for, the special
> > character and 20 char? Also, do you get results such as "XXXX formula"?
> >
> > On Fri, Oct 9, 2009 at 3:45 PM, Avlesh Singh <avlesh@gmail.com> wrote:
> >
> > > > I have a very similar set-up for my auto-suggest (I am sorry that it can't
> > > > be viewed from an external network).
> > > > I am sending you my field definitions, please use them and see if it works
> > > > out correctly.
> > >
> > > <fieldType name="autocomplete" class="solr.TextField">
> > >     <analyzer type="index">
> > >        <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >         <filter class="solr.PatternReplaceFilterFactory"
> > > pattern="([^a-z0-9])" replacement="" replace="all" />
> > >        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> > >        <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100"
> > > minGramSize="1" />
> > >     </analyzer>
> > >    <analyzer type="query">
> > >        <tokenizer class="solr.KeywordTokenizerFactory"/>
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >         <filter class="solr.PatternReplaceFilterFactory"
> > > pattern="([^a-z0-9])" replacement="" replace="all" />
> > >        <filter class="solr.PatternReplaceFilterFactory"
> > > pattern="^(.{20})(.*)?" replacement="$1" replace="all" />
> > >        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> > >    </analyzer>
> > > </fieldType>
> > >
> > > <fieldType name="tokenized_autocomplete" class="solr.TextField">
> > >     <analyzer type="index">
> > >        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> > >        <filter class="solr.EdgeNGramFilterFactory" maxGramSize="100"
> > > minGramSize="1" />
> > >     </analyzer>
> > >    <analyzer type="query">
> > >        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > >        <filter class="solr.LowerCaseFilterFactory" />
> > >         <filter class="solr.PatternReplaceFilterFactory"
> > > pattern="([^a-z0-9])" replacement="" replace="all" />
> > >        <filter class="solr.PatternReplaceFilterFactory"
> > > pattern="^(.{20})(.*)?" replacement="$1" replace="all" />
> > >        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> > >    </analyzer>
> > > </fieldType>
> > >
> > > <field name="suggestion" type="autocomplete" indexed="true"
> > > stored="false"/>
> > > <field name="tokenized_suggestion" type="tokenized_autocomplete"
> > > indexed="true" stored="true"/>
> > >
> > > q=(suggestion:formula^2 tokenized_suggestion:formula)
> > >
> > > Hope this helps.
> > >
> > > Cheers
> > > Avlesh
> > >
> > > On Fri, Oct 9, 2009 at 1:03 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > >
> > > > Yeah, I do get results. Anything else I missed out?
> > > > I want it to work like this site's auto suggest feature.
> > > >
> > > > http://www.sematext.com/demo/ac/index.html
> > > >
> > > > Try the keyword 'formula'.
> > > >
> > > > Thanks,
> > > > Rih
> > > >
> > > >
> > > > > On Fri, Oct 9, 2009 at 3:24 PM, Avlesh Singh <avlesh@gmail.com> wrote:
> > > >
> > > > > Can you just do q=autoCompleteHelper2:caf to see you get results?
> > > > >
> > > > > Cheers
> > > > > Avlesh
> > > > >
> > > > > > On Fri, Oct 9, 2009 at 12:53 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > >
> > > > > > Yup, it is. Both are copied from another field called name.
> > > > > >
> > > > > > > On Fri, Oct 9, 2009 at 3:15 PM, Avlesh Singh <avlesh@gmail.com> wrote:
> > > > > > >
> > > > > > > > Lame question, but are you populating data in the autoCompleteHelper2
> > > > > > > > field?
> > > > > > >
> > > > > > > Cheers
> > > > > > > Avlesh
> > > > > > >
> > > > > > > > On Fri, Oct 9, 2009 at 12:36 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > >
> > > > > > > > The problem is, I'm getting equal scores for this:
> > > > > > > > Query:
> > > > > > > > q=(autoCompleteHelper2:caf^10.0 autoCompleteHelper:caf)
> > > > > > > >
> > > > > > > > Partial Result:
> > > > > > > >
> > > > > > > > <doc>
> > > > > > > > <float name="score">0.7821733</float>
> > > > > > > > <str name="autoCompleteHelper">Bikes Café</str>
> > > > > > > > </doc>
> > > > > > > >
> > > > > > > > <doc>
> > > > > > > > <float name="score">0.7821733</float>
> > > > > > > > <str name="autoCompleteHelper">Cafe Feliy</str>
> > > > > > > > </doc>
> > > > > > > >
> > > > > > > > I'm using the standard request handler with this.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Rih
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Fri, Oct 9, 2009 at 3:02 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Avlesh,
> > > > > > > > > > I don't see anything wrong with the data from analysis.
> > > > > > > > > >
> > > > > > > > > > KeywordTokenized:
> > > > > > > > > >
> > > > > > > > > > term position    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | ...
> > > > > > > > > > term text        | th | he | e  | c | ch | ha | am | mp | pi | io | on | the | he  | e c | ch | cha | ...
> > > > > > > > > > term type        | word | word | word | word | word | word | word | word | word | word | word | word | word | word | word | word | ...
> > > > > > > > > > source start,end | 0,2 | 1,3 | 2,4 | 3,5 | 4,6 | 5,7 | 6,8 | 7,9 | 8,10 | 9,11 | 10,12 | 0,3 | 1,4 | 2,5 | 3,6 | ...
> > > > > > > > >
> > > > > > > > > > WhitespaceTokenized:
> > > > > > > > > >
> > > > > > > > > > term position    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | ...
> > > > > > > > > > term text        | th | he | the | ch | ha | am | mp | pi | io | on | cha | ...
> > > > > > > > > > term type        | word | word | word | word | word | word | word | word | word | word | word | ...
> > > > > > > > > > source start,end | 0,2 | 1,3 | 0,3 | 0,2 | 1,3 | 2,4 | 3,5 | 4,6 | 5,7 | 6,8 | ...
> > > > > > > > >
> > > > > > > > > Is term position considered during scoring?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Rih
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Fri, Oct 9, 2009 at 9:40 AM, Avlesh Singh <avlesh@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >> Use the field analysis tool to see how the data is being analyzed in both
> > > > > > > > > >> the fields.
> > > > > > > > >>
> > > > > > > > >> Cheers
> > > > > > > > >> Avlesh
> > > > > > > > >>
> > > > > > > > > >> On Fri, Oct 9, 2009 at 12:56 AM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > Hmm... I don't quite get the desired results. Those starting with "cha" are
> > > > > > > > > >> > now randomly ordered. Is there something wrong with the filters I applied?
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > > >> > On Thu, Oct 8, 2009 at 7:38 PM, Avlesh Singh <avlesh@gmail.com> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > > Filters? I did not mean filters at all.
> > > > > > > > > >> > > I am in a mad rush right now, but on the face of it your field
> > > > > > > > > >> > > definitions look right.
> > > > > > > > >> > >
> > > > > > > > >> > > This is what I asked for -
> > > > > > > > >> > > q=(autoComplete2:cha^10 autoComplete:cha)
> > > > > > > > >> > >
> > > > > > > > > >> > > Lemme know if this does not work for you.
> > > > > > > > >> > >
> > > > > > > > >> > > Cheers
> > > > > > > > >> > > Avlesh
> > > > > > > > >> > >
> > > > > > > > > >> > > On Thu, Oct 8, 2009 at 4:58 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > > > >> > >
> > > > > > > > >> > > > Hi Avlesh,
> > > > > > > > >> > > >
> > > > > > > > > >> > > > I can't seem to get the scores right.
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > I now have these types for the fields I'm targeting,
> > > > > > > > >> > > >
> > > > > > > > >> > > > <fieldType name="autoComplete"
> class="solr.TextField"
> > > > > > > > >> > > > positionIncrementGap="1">
> > > > > > > > >> > > >      <analyzer type="index">
> > > > > > > > >> > > >        <tokenizer
> > > > class="solr.WhitespaceTokenizerFactory"/>
> > > > > > > > >> > > >        <filter class="solr.LowerCaseFilterFactory"
> />
> > > > > > > > >> > > >        <filter class="solr.NGramFilterFactory"
> > > > > minGramSize="1"
> > > > > > > > >> > > > maxGramSize="20"/>
> > > > > > > > >> > > >      </analyzer>
> > > > > > > > >> > > >      <analyzer type="query">
> > > > > > > > >> > > >        <tokenizer
> > > > class="solr.WhitespaceTokenizerFactory"/>
> > > > > > > > >> > > >        <filter class="solr.LowerCaseFilterFactory"
> />
> > > > > > > > >> > > >      </analyzer>
> > > > > > > > >> > > >    </fieldType>
> > > > > > > > >> > > >    <fieldType name="autoComplete2"
> > > class="solr.TextField"
> > > > > > > > >> > > > positionIncrementGap="1">
> > > > > > > > >> > > >      <analyzer type="index">
> > > > > > > > >> > > >        <tokenizer
> > class="solr.KeywordTokenizerFactory"/>
> > > > > > > > >> > > >        <filter class="solr.LowerCaseFilterFactory"
> />
> > > > > > > > >> > > >        <filter class="solr.NGramFilterFactory"
> > > > > minGramSize="1"
> > > > > > > > >> > > > maxGramSize="20"/>
> > > > > > > > >> > > >      </analyzer>
> > > > > > > > >> > > >      <analyzer type="query">
> > > > > > > > >> > > >        <tokenizer
> > class="solr.KeywordTokenizerFactory"/>
> > > > > > > > >> > > >        <filter class="solr.LowerCaseFilterFactory"
> />
> > > > > > > > >> > > >      </analyzer>
> > > > > > > > >> > > >    </fieldType>
> > > > > > > > >> > > >
> > > > > > > > >> > > > My query is this,
> > > > > > > > >> > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> q=*:*&fq=autoCompleteHelper:cha+autoCompleteHelper2:cha&qf=autoCompleteHelper^10.0+autoCompleteHelper2^1.0
> > > > > > > > >> > > >
> > > > > > > > >> > > > What should I tweak from the
above config and query?
> > > > > > > > >> > > >
> > > > > > > > >> > > > Thanks,
> > > > > > > > >> > > > Rih
> > > > > > > > >> > > >
> > > > > > > > >> > > >
> > > > > > > > > >> > > > On Thu, Oct 8, 2009 at 4:38 PM, R. Tan <tanrihaed58@gmail.com> wrote:
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > > I will have to pass on this and try your suggestion first. So, how does
> > > > > > > > > >> > > > > your suggestion (1 and 2) boost my startswith query? Is it because of
> > > > > > > > > >> > > > > the n-gram filter?
> > > > > > > > >> > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > > >
> > > > > > > > > >> > > > > On Thu, Oct 8, 2009 at 2:27 PM, Sandeep Tagore <sandeep.tagore@gmail.com> wrote:
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > >>
> > > > > > > > > >> > > > >> Yes it can be done but it needs some customization. Search for custom sort
> > > > > > > > > >> > > > >> implementations/discussions.
> > > > > > > > > >> > > > >> You can check...
> > > > > > > > > >> > > > >> http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
> > > > > > > > > >> > > > >> .
> > > > > > > > > >> > > > >> Let us know if you have any issues.
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> Sandeep
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> R. Tan wrote:
> > > > > > > > >> > > > >> >
> > > > > > > > > >> > > > >> > This might work and I also have a single value field which makes it
> > > > > > > > > >> > > > >> > cleaner.
> > > > > > > > > >> > > > >> > Can sort be customized (with indexOf()) from the solr parameters alone?
> > > > > > > > >> > > > >> >
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >> --
> > > > > > > > > >> > > > >> View this message in context:
> > > > > > > > > >> > > > >> http://www.nabble.com/Scoring-for-specific-field-queries-tp25798390p25799055.html
> > > > > > > > > >> > > > >> Sent from the Solr - User mailing list archive at Nabble.com.
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >>
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
