lucy-issues mailing list archives

From "Henry (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [lucy-issues] [jira] [Issue Comment Edited] (LUCY-199) Highlighting/excerpt on URLs
Date Sat, 31 Dec 2011 14:18:30 GMT

    [ https://issues.apache.org/jira/browse/LUCY-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178019#comment-13178019 ]

Henry edited comment on LUCY-199 at 12/31/11 2:18 PM:
------------------------------------------------------

I have attached a test case for this issue as requested by Nick.

To use:
mkdir test
cd test
tar zxf hltest1.tgz

./runme   # to create the test index

To show the incorrect highlighting (i.e., no highlighting at all) with an excerpt length of
60 characters (two hits):
% ./search mweb
&#8230;
&#8230;&#8230;


Another:
./search adsl
&#8230;

More:
./search  news24.com
&#8230;
&#8230;

Finally:
./search  iol.co.za
&#8230;


If you comment out the length in ./search:

- excerpt_length => 60,
+ #excerpt_length => 60,

...and rerun the ./search tests, you'll get the expected highlighting, but obviously with
an undesirable length.
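
The four queries above can also be rerun in one pass, which makes it easier to compare the
output before and after commenting out excerpt_length. This is only a sketch: it assumes the
./search script from hltest1.tgz is in the current directory and takes the query as its single
argument, as in the commands above. The run_queries helper is my own name, not part of the
attachment.

```shell
#!/bin/sh
# Rerun every query from the test case with the given search command.
# run_queries is a hypothetical helper; it defaults to ./search from hltest1.tgz.
run_queries() {
    cmd=${1:-./search}
    for q in mweb adsl news24.com iol.co.za; do
        printf '== %s ==\n' "$q"   # label each block of output with its query
        "$cmd" "$q"
    done
}

# Only run against the real script if it is present and executable.
if [ -x ./search ]; then
    run_queries ./search
else
    echo "./search not found; unpack hltest1.tgz and run ./runme first" >&2
fi
```

Running it once with excerpt_length => 60 and once with that line commented out gives two
outputs that can be compared side by side.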
                
> Highlighting/excerpt on URLs 
> -----------------------------
>
>                 Key: LUCY-199
>                 URL: https://issues.apache.org/jira/browse/LUCY-199
>             Project: Lucy
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.2.2 (incubating)
>         Environment: Linux
>            Reporter: Henry
>         Attachments: hltest1.tgz
>
>
> If I explicitly specify excerpt_length:
> my $hl             = Lucy::Highlight::Highlighter->new(
>    searcher       => $searcher,
>    query          => $query_compiler,
>    field          => 'site',
>    excerpt_length => 60,
> );
> ...and the field content is longer than 60, then
> $page_highlighter->create_excerpt($hit);
> returns '...'.
> Content which is shorter than 60 returns the highlighted excerpt as expected.
> If I comment out "excerpt_length => 60," above, then it returns the full
> non-truncated excerpt with highlighting as expected.
> Some >60char samples which return &#8230;/"...", searching for [iol.co.za] or
> [news24.com] (brackets are mine):
> [www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220]
> [http://www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSiteHome/0,,,00.html]
> [www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSite_TravelClub/0,,,00.html]
> The following return double-ellipses ("......" - &#8230;&#8230;), searching
> for [adsl mweb.com]:
> [http://www.mweb.co.za/helpcentre/ADSL/ADSLGeneralIdisagreewithyourusagereport.aspx]
> [http://www.mweb.co.za/helpcentre/FrequentlyAskedQuestions/MWEBHelpCentreFAQsHowdoI/FAQHowdoIHowdoImigratemyADSL/tabid/661/Default.aspx]
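
To confirm that each sample above really is longer than the configured excerpt_length, the
lengths can be checked directly. This is a minimal sketch: the strings are copied from the
report, the limit of 60 is the excerpt_length value used there, and over_limit is a
hypothetical helper name. It only measures the field values and says nothing about how the
highlighter picks its excerpt.

```shell
#!/bin/sh
# excerpt_length used in the report
limit=60

# Succeeds when the string in $1 is longer than $limit characters.
over_limit() {
    [ "${#1}" -gt "$limit" ]
}

# Sample field values copied from the report, one per line.
while IFS= read -r url; do
    if over_limit "$url"; then
        printf 'over   %3d  %s\n' "${#url}" "$url"
    else
        printf 'within %3d  %s\n' "${#url}" "$url"
    fi
done <<'EOF'
www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220
http://www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSiteHome/0,,,00.html
www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSite_TravelClub/0,,,00.html
http://www.mweb.co.za/helpcentre/ADSL/ADSLGeneralIdisagreewithyourusagereport.aspx
http://www.mweb.co.za/helpcentre/FrequentlyAskedQuestions/MWEBHelpCentreFAQsHowdoI/FAQHowdoIHowdoImigratemyADSL/tabid/661/Default.aspx
EOF
```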
