lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-5244) Exporting Full Sorted Result Sets
Date Mon, 28 Jul 2014 03:05:42 GMT

    [ https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075870#comment-14075870 ]

Joel Bernstein edited comment on SOLR-5244 at 7/28/14 3:03 AM:
---------------------------------------------------------------

New patch with all tests passing. Also added syntax error handling.

It looks like rows=-1 is not the best way to signal the export because it seems to already
be used to signal other behavior.

So right now the syntax is:
{code}
q=hello&rq={!xport}&wt=xsort&fl=...&sort=...
{code}
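As a rough illustration of the syntax above, here is a minimal Python sketch that assembles the parameters into a URL-encoded query string. The function name build_export_params is hypothetical, and the fl and sort values are borrowed from the issue description because the comment elides them. The host and request handler are not stated here, so they are left out.

```python
from urllib.parse import urlencode


def build_export_params(query, fields, sort):
    """Return a URL-encoded query string for the export syntax.

    Hypothetical helper for illustration only; it is not part of the patch.
    """
    params = [
        ("q", query),
        ("rq", "{!xport}"),        # RankQuery that signals the export
        ("wt", "xsort"),           # response writer named in the comment
        ("fl", ",".join(fields)),  # field list (values assumed)
        ("sort", sort),            # sort criteria (values assumed)
    ]
    return urlencode(params)


query_string = build_export_params("hello", ["a", "b", "c"], "a desc,b desc")
print(query_string)
```

The braces, exclamation mark, commas, and spaces are percent-encoded by urlencode, which is what most HTTP clients would need before sending the request.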
In general, using the RankQuery (rq param) is more intuitive than the earlier approach, where a
PostFilter was being used to collect the BitSet.

Happy to try a different syntax though if there are more ideas.



> Exporting Full Sorted Result Sets
> ---------------------------------
>
>                 Key: SOLR-5244
>                 URL: https://issues.apache.org/jira/browse/SOLR-5244
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 5.0
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 5.0, 4.10
>
>         Attachments: 0001-SOLR_5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch
>
>
> This ticket allows Solr to export full sorted result sets. The proposed syntax is:
> {code}
> q=*:*&rows=-1&wt=xsort&fl=a,b,c&sort=a desc,b desc
> {code}
> Under the covers, the rows=-1 parameter will signal Solr to use the ExportQParserPlugin
> as a RankQuery, which will simply collect a BitSet of the results. The SortingResponseWriter
> will sort the results based on the sort criteria and stream the results out.
> This capability will open up Solr for a whole range of uses that were typically done
> using aggregation engines like Hadoop. For example:
> *Large Distributed Joins*
> A client outside of Solr calls two different Solr collections and returns the results
> sorted by a join key. The client iterates through both streams and performs a merge join.
> *Fully Distributed Field Collapsing/Grouping*
> A client outside of Solr makes individual calls to all the servers in a single collection
> and returns results sorted by the collapse key. The client merge joins the sorted lists on
> the collapse key to perform the field collapse.
> *High Cardinality Distributed Aggregation*
> A client outside of Solr makes individual calls to all the servers in a single collection
> and sorts on a high cardinality field. The client then merge joins the sorted lists to perform
> the high cardinality aggregation.
> *Large Scale Time Series Rollups*
> A client outside Solr makes individual calls to all servers in a collection and sorts
> on time dimensions. The client merge joins the sorted result sets and rolls up the time dimensions
> as it iterates through the data.
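
The merge join mentioned in these scenarios could look roughly like the following client-side sketch. It assumes both streams arrive as dicts sorted ascending on the join key (a real request would have to use a matching sort direction). The name merge_join, the field names, and the sample rows are all made up for illustration; nothing here is part of the patch.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key):
    """Inner-join two iterables of dicts, both sorted ascending by `key`.

    Hypothetical client-side helper; only one group of equal keys from the
    right stream is buffered in memory at a time.
    """
    get_key = itemgetter(key)
    left_groups = groupby(left, key=get_key)
    right_groups = groupby(right, key=get_key)
    left_item = next(left_groups, None)
    right_item = next(right_groups, None)
    while left_item is not None and right_item is not None:
        left_key, left_rows = left_item
        right_key, right_rows = right_item
        if left_key < right_key:
            # Left stream is behind: skip to its next key.
            left_item = next(left_groups, None)
        elif left_key > right_key:
            # Right stream is behind: skip to its next key.
            right_item = next(right_groups, None)
        else:
            # Keys match: emit every left/right combination for this key.
            right_buffer = list(right_rows)
            for left_row in left_rows:
                for right_row in right_buffer:
                    yield {**left_row, **right_row}
            left_item = next(left_groups, None)
            right_item = next(right_groups, None)


# Sample rows standing in for two sorted result streams (assumed data).
people = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 4, "name": "d"}]
orders = [{"id": 2, "price": 10}, {"id": 2, "price": 20},
          {"id": 3, "price": 30}, {"id": 4, "price": 40}]
joined = list(merge_join(people, orders, "id"))
```

Because both inputs are consumed in a single forward pass, the client never needs to hold a full result set in memory, which is the point of asking Solr for sorted output.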
> In these scenarios Solr is being used as a distributed sorting engine. Developers can
> write clients that take advantage of this sorting capability in any way they wish.
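
To make the "distributed sorting engine" idea a little more concrete, here is a hedged sketch of a client that merges already-sorted per-server streams and rolls them up by key, as in the aggregation and time series scenarios. It assumes each stream is sorted ascending on the grouping field. The name rollup, the field names day and hits, and the sample data are hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def rollup(shard_streams, key, value):
    """Merge streams sorted ascending by `key` and sum `value` per key.

    Hypothetical client-side helper, not part of the patch.
    """
    get_key = itemgetter(key)
    # k-way merge keeps the combined stream sorted without loading it all.
    merged = heapq.merge(*shard_streams, key=get_key)
    for group_key, rows in groupby(merged, key=get_key):
        yield group_key, sum(row[value] for row in rows)


# Sample rows standing in for the sorted output of two servers (assumed data).
shard1 = [{"day": "2014-07-01", "hits": 3}, {"day": "2014-07-02", "hits": 1}]
shard2 = [{"day": "2014-07-01", "hits": 2}, {"day": "2014-07-03", "hits": 5}]
totals = list(rollup([shard1, shard2], "day", "hits"))
```

The same shape would work for field collapsing by replacing the sum with whatever per-group selection the client needs.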



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

