lucene-solr-user mailing list archives

From David Smiley <david.w.smi...@gmail.com>
Subject Re: Highlighting phone numbers
Date Wed, 18 May 2016 16:57:15 GMT
Perhaps an easy thing to try is to see if the FastVectorHighlighter yields any
different results.  There are some nuances to the highlighters -- it might.
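
A minimal sketch of what that experiment might involve, assuming a Solr
5.x/6.0-era setup (newer releases select this highlighter with
hl.method=fastVector instead).  The FastVectorHighlighter needs term vectors
with positions and offsets on the highlighted field, plus a reindex after the
schema change.  The field name PhoneListS and the type "phone" come from the
quoted message; the indexed, stored and multiValued attributes are assumptions
about the existing field definition.

```xml
<!-- schema.xml: enable term vectors, positions and offsets for the
     FastVectorHighlighter. indexed/stored/multiValued are assumed here;
     keep whatever the field already uses. -->
<field name="PhoneListS" type="phone" indexed="true" stored="true" multiValued="true"
       termVectors="true" termPositions="true" termOffsets="true"/>

<!-- Request parameters, written in the same form as the echoed params in the
     quoted response. hl.tag.pre/hl.tag.post are the FastVectorHighlighter
     counterparts of hl.simple.pre/hl.simple.post. -->
<lst name="params">
  <str name="q">PhoneListS:17573062033</str>
  <str name="hl">true</str>
  <str name="hl.fl">PhoneListS</str>
  <str name="hl.useFastVectorHighlighter">true</str>
  <str name="hl.tag.pre">&lt;em&gt;</str>
  <str name="hl.tag.post">&lt;/em&gt;</str>
</lst>
```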

Failing that, this is likely due to your analysis chain, and where exactly the
offsets point to, which you can see/debug in Solr's analysis screen.  You
might have to develop custom analysis components (e.g. custom TokenFilter)
if the offsets aren't what you want.
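
To illustrate the kind of change that can move offsets, here is an untested
sketch that uses only stock components and could be tried before writing a
custom TokenFilter.  It assumes that in the quoted chain the KeywordTokenizer
emits the whole value as one token and the later filters keep that token's
offsets, so every gram points at the full number.  Stripping non-digits in a
char filter (char filters map offsets back to the original text) and
n-gramming in the tokenizer should give each gram its own offsets.  The type
name phone_hl is hypothetical, and the custom similarity from the quoted type
is left out for brevity.

```xml
<!-- Hypothetical variant of the quoted "phone" type; not a drop-in replacement. -->
<fieldType name="phone_hl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- Remove everything except digits before tokenizing. -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[^0-9]" replacement=""/>
    <!-- Each gram is emitted by the tokenizer with its own start/end offsets. -->
    <tokenizer class="solr.NGramTokenizerFactory"
               minGramSize="5" maxGramSize="30"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[^0-9]" replacement=""/>
    <!-- Gram sizes mirror the quoted query analyzer. -->
    <tokenizer class="solr.NGramTokenizerFactory"
               minGramSize="3" maxGramSize="30"/>
  </analyzer>
</fieldType>
```

Where the highlight ends relative to the removed punctuation (for example
whether a trailing "." is included) is something to check in the analysis
screen rather than assume.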

Good luck,
~ David

On Wed, May 18, 2016 at 9:07 AM marotosg <marotosg@gmail.com> wrote:

> Hi,
>
> I have a Solr multivalued field with a list of phone numbers in many
> different formats. The field type is below.
> <fieldType name="phone" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="([^0-9])" replacement="" replace="all"/>
>     <filter class="solr.NGramFilterFactory"
>             minGramSize="5" maxGramSize="30"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>     <filter class="solr.PatternReplaceFilterFactory"
>             pattern="([^0-9])" replacement="" replace="all"/>
>     <filter class="solr.NGramFilterFactory"
>             minGramSize="3" maxGramSize="30"/>
>   </analyzer>
>   <similarity class="com.spencerstuart.similarities.SpencerStuartNoSimilarity"/>
> </fieldType>
>
> I have a requirement to highlight the part of the number matched to explain
> to the user why this record is returned.
>
> If I search for "17573062033" I am able to match many results, but the
> full number is highlighted.
>
> <lst name="responseHeader">
>   <int name="status">0</int>
>   <int name="QTime">12</int>
>   <lst name="params">
>     <str name="fl">CoreID,PhoneListS</str>
>     <str name="indent">true</str>
>     <str name="q">PhoneListS:17573062033</str>
>     <str name="_">1463576646314</str>
>     <str name="hl.simple.pre">&lt;em&gt;</str>
>     <str name="hl.simple.post">&lt;/em&gt;</str>
>     <str name="hl.fl">PhoneListS</str>
>     <str name="wt">xml</str>
>     <str name="hl">true</str>
>     <str name="rows">3</str>
>   </lst>
> </lst>
> <result name="response" numFound="1715" start="0">
>   <doc>
>     <arr name="PhoneListS">
>       <str>1757.306.2033</str>
>     </arr>
>     <str name="CoreID">10224838</str></doc>
>   <doc>
>     <arr name="PhoneListS">
>       <str>1757.306.2033</str>
>     </arr>
>     <str name="CoreID">10224840</str></doc>
>   <doc>
>     <arr name="PhoneListS">
>       <str>1757.306.2089</str>
>       <str>1757.306.7006</str>
>     </arr>
>     <str name="CoreID">10034811</str></doc>
> </result>
> <lst name="highlighting">
>   <lst name="10224838">
>     <arr name="PhoneListS">
>       <str><em>1757.306.2033</em></str>
>     </arr>
>   </lst>
>   <lst name="10224840">
>     <arr name="PhoneListS">
>       <str><em>1757.306.2033</em></str>
>     </arr>
>   </lst>
>   <lst name="10034811">
>     <arr name="PhoneListS">
>       <str><em>1757.306.2089</em></str>
>     </arr>
>   </lst>
> </lst>
> </response>
>
> Would it be possible to get only the part of the number that matches?
> Something like this: <em>1757.306</em>.2089
>
> thanks
> Sergio
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Highlighting-phone-numbers-tp4277491.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com
