lucene-solr-user mailing list archives

From Avlesh Singh <avl...@gmail.com>
Subject Re: "begins with" searches
Date Wed, 28 Oct 2009 04:29:41 GMT
>
> My next issue relates to how to get the results of the author field to come up
> in a search across all fields. For example, a search on author:"Houghton, B"
> (which uses the edgytext) yields 16 documents, but a search on
> all:"Houghton, B" (which doesn't) yields only 9. I thought the solution
> should be <copyfield source="*author_mt" dest="all"/> but that doesn't do
> the trick.
>

Do you have a field called "all"? How is it set up? Can you post the
schema.xml snippet relating to this field here?
<copyField> is supported for a dynamic field source. <copyField
source="*author_mt" dest="all"/> should work for you as long as you have a
field called "all" defined in your schema. Moreover, for your specific use
case, the "all" field needs to be of type "edgytext".
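
A minimal sketch of that setup follows. The attribute values on the "all" field are assumptions rather than anything taken from the schema under discussion, and only the "edgytext" type and the "*author_mt" source come from this thread. Element names in schema.xml are case-sensitive, so the directive is written copyField; a lowercase <copyfield> element would most likely be ignored without an error.

```xml
<!-- Sketch only: indexed/stored/multiValued values here are assumptions. -->
<!-- multiValued="true" because several source values may be copied into it. -->
<field name="all" type="edgytext" indexed="true" stored="false" multiValued="true"/>

<!-- Copies the raw value of every field matching *author_mt into "all". -->
<!-- The copied value is then analysed with the "all" field's own type. -->
<copyField source="*author_mt" dest="all"/>
```

Existing documents would need to be re-indexed before the copied author values become searchable through "all". If "all" also receives long free-text fields, the KeywordTokenizer in "edgytext" treats each whole value as one token, so a separate catch-all field for prefix matching may be the safer design.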

Cheers
Avlesh

On Wed, Oct 28, 2009 at 9:35 AM, Bernadette Houghton <
bernadette.houghton@deakin.edu.au> wrote:

> Thanks Avlesh. The issue with not doing a phrase query on my "edgytext"
> field was that my parent application was adding an escape character to the
> quotation marks, and I was hoping to fix (or rather, work around) at the
> solr end to save maintenance overhead. But I've done a hack in the parent
> application to remove those escape chars, and all is working well in that
> respect.
>
> My next issue relates to how to get the results of the author field to come up
> in a search across all fields. For example, a search on author:"Houghton, B"
> (which uses the edgytext) yields 16 documents, but a search on
> all:"Houghton, B" (which doesn't) yields only 9. I thought the solution
> should be <copyfield source="*author_mt" dest="all"/> but that doesn't do
> the trick.
>
> Thanks!
>
> bern
> -----Original Message-----
> From: Avlesh Singh [mailto:avlesh@gmail.com]
> Sent: Tuesday, 27 October 2009 5:54 PM
> To: solr-user@lucene.apache.org
> Subject: Re: "begins with" searches
>
> You are right about the parsing of query terms without a double quote
> (solrQueryParser's defaultOperator has to be "AND" in your case). For the
> problem at hand, two things -
>
>    1. Do you have any reason for not doing a PhraseQuery (query terms
>    enclosed in double quotes) on your "edgytext" field? If not, you can
>    always enclose your query in double quotes to get the expected
>    "begins with" matches.
>    2. You can always "escape" your query string before passing it to Solr;
>    then you wouldn't need to pass your query term in double quotes. For
>    example, the query string surname, fre when "escaped" would be converted
>    into surname,\ fre (a backslash before the space; with the space
>    URL-encoded as +, this appears as surname,\+fre), thereby asking Solr
>    to treat it as a single query term. For more details -
>    http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Escaping%20Special%20Characters
>    If you use SolrJ, the ClientUtils class (org.apache.solr.client.solrj.util)
>    has an escapeQueryChars helper for query escaping.
>
> Cheers
> Avlesh
>
> On Tue, Oct 27, 2009 at 9:22 AM, Bernadette Houghton <
> bernadette.houghton@deakin.edu.au> wrote:
>
> > Thanks for this suggestion (thanks Gerald also: no, we're not using
> > BlackLight-type prefixes).
> >
> > I've set up an edgytext fieldType in schema.xml thus -
> >
> > <fieldType name="edgytext" class="solr.TextField" positionIncrementGap="100">
> >  <analyzer type="index">
> >   <tokenizer class="solr.KeywordTokenizerFactory"/>
> >   <filter class="solr.LowerCaseFilterFactory"/>
> >   <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />
> >  </analyzer>
> >  <analyzer type="query">
> >   <tokenizer class="solr.KeywordTokenizerFactory"/>
> >   <filter class="solr.LowerCaseFilterFactory"/>
> >  </analyzer>
> > </fieldType>
> >
> > And defined a field name thus -
> >
> > <dynamicField name="*author_mt" type="edgytext" indexed="true" stored="true" multiValued="true"/>
> >
> > The results are mixed -
> >
> > * searches such as "surname, f" and "surname, fre" (with quotations and
> > commas) work well, retrieving "surname, f", "surname, Fred", "surname,
> > Frederick" etc etc
> > * searches such as the above but without quotations don't work too well,
> > as they get parsed as author_mt:surname + author_mt:firstname, with solr
> > reading the query as "author beginning with surname AND author beginning
> > with firstname", which yields nil results.
> >
> > Is there an analyser that will strip the whitespace out altogether? Or
> > another alternative?
> >
> > bern
> >
> > -----Original Message-----
> > From: Avlesh Singh [mailto:avlesh@gmail.com]
> > Sent: Monday, 26 October 2009 6:32 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: "begins with" searches
> >
> > Read up on setting up these kinds of searches here -
> >
> > http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
> >
> > Cheers
> > Avlesh
> >
> > On Mon, Oct 26, 2009 at 7:43 AM, Bernadette Houghton <
> > bernadette.houghton@deakin.edu.au> wrote:
> >
> > > We need to offer "begins with" type searches, e.g. a search for
> > > "surname, f" will retrieve "surname, firstname", "surname, f",
> > > "surname fm" etc.
> > >
> > > Ideally, the user would be able to enter something like "surname f*".
> > >
> > > However, wildcards don't work on phrase searches, nor do range searches.
> > >
> > > Any suggestions as to how best to search for "begins with" phrases; or
> > > how to best configure solr to support such searches?
> > >
> > > TIA
> > > Bernadette Houghton, Library Business Applications Developer
> > > Deakin University Geelong Victoria 3217 Australia.
> > > Phone: 03 5227 8230 International: +61 3 5227 8230
> > > Fax: 03 5227 8000 International: +61 3 5227 8000
> > > MSN: bern_houghton@hotmail.com
> > > Email: bernadette.houghton@deakin.edu.au
> > > Website: http://www.deakin.edu.au
> > > Deakin University CRICOS Provider Code 00113B (Vic)
> > >
> >
>
