lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-5463) Provide cursor/token based "searchAfter" support that works with arbitrary sorting (ie: "deep paging")
Date Sat, 04 Jan 2014 02:27:51 GMT

     [ https://issues.apache.org/jira/browse/SOLR-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-5463:
---------------------------

    Attachment: SOLR-5463.patch

New in this patch...

* user error if trying to use cursor with grouping
** I actually thought i put this in a while ago, because i couldn't wrap my head around what
it should _mean_, in a perfect world, to use grouping with a cursor (let alone how to implement
it) but reviewing the tests i realized it wasn't there yet.
* fixed the _last_ remaining nocommit: sortDocSet
** working through the code, i realized we could actually leverage the cached docSets in the
{{useFilterForSortedQuery}} situation -- so I refactored the method a bit to give it more
context (so it could call the {{buildTopDocsCollector}} helper method i added), removed the
restriction on {{useFilterCache}} for sorted doc set when a cursor is used, and randomized
{{useFilterForSortedQuery}} in the cursor test configs
* added simple test combining faceting w/cursor.  this was something i was pretty certain
would work fine, but reviewing the clover coverage reports when running just the cursor tests,
i realized that the {{getDocListAndSet}} paths in SolrIndexSearcher weren't being hit, so
we needed a test to prove it.
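
A minimal sketch of the first item above (rejecting a cursor combined with grouping as a user error), written in Python rather than as the patch's actual Java code. The parameter names {{cursorMark}} and {{group}} and the helper {{validate_cursor_params}} are assumptions for illustration only; the patch may name and structure things differently.

```python
def validate_cursor_params(params):
    """Hypothetical request check: fail fast when a cursor is used with grouping.

    `params` is a plain dict of request parameters. The keys "cursorMark" and
    "group" are assumed names, not taken from the patch.
    """
    uses_cursor = params.get("cursorMark") is not None
    uses_grouping = str(params.get("group", "false")).lower() == "true"
    if uses_cursor and uses_grouping:
        # Reported as a client mistake, not a server failure.
        raise ValueError("cursor paging cannot be combined with grouping")
```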

There are probably still a lot more permutations of things that could be tested, but i'm feeling
really good about the state of this patch -- i think it's ready to commit (to trunk) and let
jenkins churn away at it.

i'll plan on pushing to trunk on monday unless anyone has concerns.

> Provide cursor/token based "searchAfter" support that works with arbitrary sorting (ie:
"deep paging")
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5463
>                 URL: https://issues.apache.org/jira/browse/SOLR-5463
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>         Attachments: SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch,
SOLR-5463.patch, SOLR-5463.patch, SOLR-5463.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch,
SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch,
SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch, SOLR-5463__straw_man.patch,
SOLR-5463__straw_man__MissingStringLastComparatorSource.patch
>
>
> I'd like to revisit a solution to the problem of "deep paging" in Solr, leveraging an
HTTP based API similar to how IndexSearcher.searchAfter works at the lucene level: require
the clients to provide back a token indicating the sort values of the last document seen on
the previous "page".  This is similar to the "cursor" model I've seen in several other REST
APIs that support "pagination" over large sets of results (notably the twitter API and its
"since_id" param) except that we'll want something that works with arbitrary multi-level sort
criteria that can be either ascending or descending.
> SOLR-1726 laid some initial groundwork here and was committed quite a while ago, but
the key bit of argument parsing to leverage it was commented out due to some problems (see
comments in that issue).  It's also somewhat out of date at this point: at the time it was
committed, IndexSearcher only supported searchAfter for simple scores, not arbitrary field
sorts; and the params added in SOLR-1726 suffer from this limitation as well.
> ---
> I think it would make sense to start fresh with a new issue with a focus on ensuring
that we have deep paging which:
> * supports arbitrary field sorts in addition to sorting by score
> * works in distributed mode
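
The cursor model described in the issue above can be sketched roughly as follows. This is an in-memory Python illustration of the idea only, not Solr's or Lucene's implementation: the function names ({{search_after}}, {{compare_sort_values}}), the document shape, and the use of a plain tuple of sort values as the "token" are all assumptions. It also assumes the last sort field is a unique tie-breaker (such as a document id) so that no document is skipped or repeated between pages.

```python
import heapq
from functools import cmp_to_key


def compare_sort_values(a, b, directions):
    """Compare two tuples of sort values level by level.

    directions[i] is +1 for an ascending field and -1 for a descending one.
    Returns a negative number if `a` sorts first, positive if `b` sorts first.
    """
    for x, y, d in zip(a, b, directions):
        if x != y:
            return d if x > y else -d
    return 0


def search_after(docs, sort_spec, rows, cursor=None):
    """Return (page, next_cursor) for one page of results.

    sort_spec: list of (field, "asc" | "desc"); last field assumed unique.
    cursor:    sort values of the last document on the previous page,
               or None for the first page.
    """
    fields = [field for field, _ in sort_spec]
    directions = [1 if order == "asc" else -1 for _, order in sort_spec]

    def sort_values(doc):
        return tuple(doc[f] for f in fields)

    def cmp(a, b):
        return compare_sort_values(a, b, directions)

    to_key = cmp_to_key(cmp)

    # Skip everything at or before the cursor, then keep only the best `rows`
    # candidates, so the work per page does not grow with the page depth.
    candidates = (
        d for d in docs
        if cursor is None or cmp(sort_values(d), cursor) > 0
    )
    page = heapq.nsmallest(rows, candidates, key=lambda d: to_key(sort_values(d)))
    next_cursor = sort_values(page[-1]) if page else cursor
    return page, next_cursor


# Hypothetical data: sort by price descending, then id ascending.
docs = [
    {"id": "a", "price": 10},
    {"id": "b", "price": 30},
    {"id": "c", "price": 10},
    {"id": "d", "price": 20},
    {"id": "e", "price": 30},
]
sort_spec = [("price", "desc"), ("id", "asc")]

seen, cursor = [], None
while True:
    page, cursor = search_after(docs, sort_spec, rows=2, cursor=cursor)
    if not page:
        break
    seen.extend(d["id"] for d in page)
# seen is now ["b", "e", "d", "a", "c"]
```

The client only ever hands back the sort values of the last document it saw, so a request for a deep page costs about the same as a request for the first one, instead of collecting and discarding all the rows before the requested offset.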



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


