lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5697) Preview issue
Date Thu, 05 Jun 2014 22:14:02 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019350#comment-14019350 ]

Hoss Man commented on LUCENE-5697:
----------------------------------

1) Lucene 3.5 is pretty old.

2) At first glance, it sounds like the problems you are describing could simply be due to
a disconnect between how your searches are executed vs how you are using the highlighter code.

w/o specific example code or a reproducible test case, there's really no way to tell if what
you are describing is a genuine bug or a misunderstanding of the API.

3) there are multiple highlighters available, and a *lot* of different ways to configure them,
so even if there is a bug, w/o more specifics there really isn't enough info here to try and
diagnose _where_ the bug is, let alone _what_ the bug is.

---

can you please provide some code (ideally a stand alone JUnit test using the lucene test-framework)
demonstrating the problem?

> Preview issue
> -------------
>
>                 Key: LUCENE-5697
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5697
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>         Environment: DocFetcher 1.1.11 on Win 7(64) pro
>            Reporter: Martin Schoenmakers
>
> In DocFetcher, which uses Lucene v3.5.0, we stumbled on a bug. The lead of DocFetcher
> has investigated and found the problem seems to be in Lucene. I do not know if this bug has
> been fixed in a later Lucene version.
> Issue:
> We use "proximity search": search on multiple words in a directory with about 300 PDF
> files.
> E.g. search for "wordA wordB wordC"~50, i.e. three words within 50 words distance of
> each other. The resulting documents are correct. But the highlighted text in the document is
> often missing.
> If the words are in the SAME order as in the search AND on the SAME page, then the highlight
> works correctly. But if the order of the words is different from the search (like "wordA wordC
> wordB") OR the words are not on the same page, then that text is not highlighted.
> As we use the proximity search on multiple words often, it severely degrades the usability.
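
To make the expectation in the report concrete, here is a minimal sketch in plain Java of
an order-independent proximity check: all terms must occur within a span of at most N
token positions, in any order. This is only an illustration of what the reporter expects
to be highlighted. It is NOT Lucene code and is not how Lucene's sloppy phrase matching or
any of its highlighters are implemented; the names ProximityCheck and withinWindow are
hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Lucene or DocFetcher.
public class ProximityCheck {

    /**
     * Returns true if some run of tokens contains every term at least once,
     * in any order, with first and last position at most maxDistance apart.
     */
    public static boolean withinWindow(List<String> tokens, Set<String> terms, int maxDistance) {
        Map<String, Integer> counts = new HashMap<>(); // terms currently inside the window
        int left = 0;
        for (int right = 0; right < tokens.size(); right++) {
            String in = tokens.get(right);
            if (terms.contains(in)) {
                counts.merge(in, 1, Integer::sum);
            }
            // Slide the left edge forward until the window is narrow enough.
            while (right - left > maxDistance) {
                String out = tokens.get(left++);
                if (terms.contains(out)) {
                    // Decrement, removing the entry when the count reaches zero.
                    counts.computeIfPresent(out, (k, v) -> v == 1 ? null : v - 1);
                }
            }
            if (counts.size() == terms.size()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Terms appear in a different order than in the query "wordA wordB wordC".
        List<String> doc = Arrays.asList("wordA", "x", "wordC", "y", "wordB");
        Set<String> terms = new HashSet<>(Arrays.asList("wordA", "wordB", "wordC"));
        System.out.println(withinWindow(doc, terms, 50)); // true
        System.out.println(withinWindow(doc, terms, 3));  // false
    }
}
```

A real reproduction would still need to index the text with Lucene and run the actual
highlighter configuration DocFetcher uses, as requested in the comment above.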


