Highlighting/excerpt on URLs
-----------------------------
Key: LUCY-199
URL: https://issues.apache.org/jira/browse/LUCY-199
Project: Lucy
Issue Type: Bug
Components: Core
Affects Versions: 0.2.2 (incubating)
Environment: Linux
Reporter: Henry
If I explicitly specify excerpt_length:
my $hl = Lucy::Highlight::Highlighter->new(
    searcher       => $searcher,
    query          => $query_compiler,
    field          => 'site',
    excerpt_length => 60,
);
...and the field content is longer than 60 characters, then
$hl->create_excerpt($hit);
returns '...'.
Content shorter than 60 characters returns the highlighted excerpt as expected.
If I comment out "excerpt_length => 60," above, then it returns the full
non-truncated excerpt with highlighting as expected.
Some samples longer than 60 characters which return only an ellipsis
(… or "...") when searching for [iol.co.za] or [news24.com]
(brackets are mine):
[www.iol.co.za/tonight/books/what-the-dickens-gets-a-statue-1.1130220]
[http://www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSiteHome/0,,,00.html]
[www.news24.com/News24v2/Travel/Mini_Site/ContentDisplay/n24TravelMiniSite_TravelClub/0,,,00.html]
The following return a double ellipsis ("......" or ……) when searching
for [adsl mweb.com]:
[http://www.mweb.co.za/helpcentre/ADSL/ADSLGeneralIdisagreewithyourusagereport.aspx]
[http://www.mweb.co.za/helpcentre/FrequentlyAskedQuestions/MWEBHelpCentreFAQsHowdoI/FAQHowdoIHowdoImigratemyADSL/tabid/661/Default.aspx]
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira