maven-issues mailing list archives

From "Jesse Glick (JIRA)" <j...@codehaus.org>
Subject [jira] Commented: (MINDEXER-14) FlatSearchResponse.totalHits = 1000 when there are in fact more
Date Fri, 25 Mar 2011 17:52:22 GMT

    [ http://jira.codehaus.org/browse/MINDEXER-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=261497#action_261497 ]

Jesse Glick commented on MINDEXER-14:
-------------------------------------

The use case here is finding a sorted list of all {{artifactId}}s for a given {{groupId}},
where the group in question in fact has hundreds of artifacts, each in several versions (so
a few thousand {{ArtifactInfo}}s in total). Rewriting the client code to use an iterator would
reduce temporary memory consumption - a helpful optimization if the data set becomes much
bigger than it is now - but would not permit incremental results to be returned, since a
sorted list cannot be started until every hit has been seen.
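
To illustrate why an iterator saves memory here but cannot make the results incremental, here is a minimal sketch. It is not Indexer code: the class {{SortedArtifactIds}} and its {{collect}} method are hypothetical, and a plain iterator of strings stands in for iterating over {{ArtifactInfo}}s and reading each {{artifactId}}.

```java
import java.util.Iterator;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of the Maven Indexer API.
final class SortedArtifactIds {
    /**
     * Drains the iterator into a sorted set of distinct ids.
     * Memory grows with the number of distinct ids, not the number of hits,
     * but nothing can be handed out early: the smallest id may arrive last.
     */
    static SortedSet<String> collect(Iterator<String> artifactIds) {
        SortedSet<String> sorted = new TreeSet<>();
        while (artifactIds.hasNext()) {
            sorted.add(artifactIds.next()); // duplicates (other versions) collapse here
        }
        return sorted;
    }
}
```

With several versions per artifact, the set holds hundreds of entries rather than the few thousand hits, which is the memory saving mentioned above.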

What drove me to file this was the fact that the API exposed by the Indexer leads you to write
what looks like a straightforward search that in fact works fine when tested on a typical
sample - but then produces incorrect results on a bigger data set (in my case showing only
the lexicographically last 79 out of 685 artifacts in a group) without throwing an exception
or otherwise returning an obvious error.

> FlatSearchResponse.totalHits = 1000 when there are in fact more
> ---------------------------------------------------------------
>
>                 Key: MINDEXER-14
>                 URL: http://jira.codehaus.org/browse/MINDEXER-14
>             Project: Maven Indexer
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>         Environment: Ubuntu, JDK 6; cause of: https://netbeans.org/bugzilla/show_bug.cgi?id=197036
>            Reporter: Jesse Glick
>
> I am running {{SearchEngine.searchFlatPaged}}. When there happen to be more than 1000
> hits in the result, it silently returns just 1000 instead. Surprising behavior since I did
> not specify any hit limit. But 1000 turns out to be the default,
> {{AbstractSearchRequest.UNDEFINED_HIT_LIMIT}}, OK.
> Where it gets weirder is that if you set {{resultHitLimit}} to {{UNDEFINED_HIT_LIMIT}},
> you still get 1000 results, contradicting the apparent meaning of "undefined". Further, if
> you set it to 999 or 1001, and there are a few thousand results, you get an empty result and
> {{totalHits}} of -1, i.e. {{AbstractSearchResponse.LIMIT_EXCEEDED}} (which by the way looks
> like a constant but is not final!), which is completely different from the behavior for 1000.
> And passing in {{Integer.MAX_VALUE}} to begin with does not work, since then Lucene gets
> an {{OutOfMemoryError}} trying to allocate a ridiculously large array or similar.
> Expected behavior: by default, on an otherwise unconfigured search request, the indexer
> would return all the hits, however many that is (allocating only a proportional amount of
> memory). If I set {{resultHitLimit}} to some value, then that will be used - I will either
> get a complete set of results, or {{LIMIT_EXCEEDED}}.
> Workaround: set {{resultHitLimit}} to 1001, then go into a loop retrying the search;
> if -1 is returned for {{totalHits}}, double the {{resultHitLimit}} and try again.
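
The workaround quoted above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Indexer's API: {{HitLimitRetry}}, its {{Response}} class, and the {{IntFunction}} callback are all hypothetical stand-ins for "run the search with this {{resultHitLimit}}". Only the -1 value for {{totalHits}} and the doubling strategy come from the report.

```java
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical wrapper; none of these types belong to the Maven Indexer.
final class HitLimitRetry {
    /** Mirrors the -1 that the report says totalHits takes when the limit is exceeded. */
    static final int LIMIT_EXCEEDED = -1;

    /** Stand-in for a search response: a hit count plus the hits themselves. */
    static final class Response {
        final int totalHits;
        final List<String> hits;

        Response(int totalHits, List<String> hits) {
            this.totalHits = totalHits;
            this.hits = hits;
        }
    }

    /**
     * Calls search with a growing hit limit until it stops reporting LIMIT_EXCEEDED.
     * The callback is assumed to build a fresh request with the given limit each time.
     */
    static Response searchAll(IntFunction<Response> search, int initialLimit) {
        if (initialLimit <= 0) {
            throw new IllegalArgumentException("initialLimit must be positive");
        }
        int limit = initialLimit;
        while (true) {
            Response response = search.apply(limit);
            if (response.totalHits != LIMIT_EXCEEDED) {
                return response; // complete result set
            }
            if (limit > Integer.MAX_VALUE / 2) {
                // Doubling again would overflow; give up rather than loop forever.
                throw new IllegalStateException("hit limit cannot be doubled past " + limit);
            }
            limit *= 2;
        }
    }
}
```

Starting from 1001 as suggested, the limits tried are 1001, 2002, 4004 and so on, so the number of repeated searches grows only logarithmically with the size of the result, and the limit never gets near the {{Integer.MAX_VALUE}} allocation that triggered the {{OutOfMemoryError}}.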

