lucene-dev mailing list archives

From "Tom Winch (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-8234) Federated Search (new) - DJoin
Date Thu, 10 Dec 2015 14:12:11 GMT

     [ https://issues.apache.org/jira/browse/SOLR-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom Winch updated SOLR-8234:
----------------------------
    Attachment: SOLR-8234.patch

some package renaming

> Federated Search (new) - DJoin
> ------------------------------
>
>                 Key: SOLR-8234
>                 URL: https://issues.apache.org/jira/browse/SOLR-8234
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Tom Winch
>            Priority: Minor
>              Labels: federated_search
>             Fix For: 4.10.3
>
>         Attachments: SOLR-8234.patch, SOLR-8234.patch, SOLR-8234.patch
>
>
> This issue describes a MergeStrategy implementation (DJoin) to facilitate federated search, that is, distributed search over documents stored in separate Solr instances (for example, one server per continent), where a single document (identified by an agreed, common unique id) may be stored in more than one instance, possibly with differing fields and data.
> When the MergeStrategy is used in a request handler (via the included QParser) in combination with distributed search (shards=), documents whose id has already been seen are not discarded (the default behaviour). Instead, they are collected and returned as a group of documents that share the same id and take a single position in the result set. This is implemented using parent/child documents, with an indicator field in the parent; see the example output below.
> Documents are sorted in the result set by the highest-ranking document with the same id. A document that ranks high on one shard may rank very low on another. As a consequence, every shard must be asked to return the fields for every document id in the result set (not just the ids it returned itself), so that all component parts of each document in the result set are returned.
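
The grouping and ranking rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code in the patch: the function name `merge_by_id`, the `(doc_id, score)` input shape, and the scores themselves are assumptions made for the example.

```python
from collections import defaultdict


def merge_by_id(shard_hits, rows=10):
    """Group hits that share an id; rank each group by its best-scoring member.

    shard_hits maps a shard URL to a list of (doc_id, score) pairs.
    Returns (ranked_groups, fetch_plan).
    """
    groups = defaultdict(list)
    for shard, hits in shard_hits.items():
        for doc_id, score in hits:
            # Duplicate ids are kept and collected, not discarded.
            groups[doc_id].append((shard, score))

    # A group takes the position of its highest-scoring member.
    ranked = sorted(
        groups.items(),
        key=lambda item: max(score for _shard, score in item[1]),
        reverse=True,
    )[:rows]

    # Every shard is asked for every id on the merged page, because a shard
    # may hold part of a document that it ranked too low to return itself.
    page_ids = [doc_id for doc_id, _members in ranked]
    fetch_plan = {shard: list(page_ids) for shard in shard_hits}
    return ranked, fetch_plan


# Hypothetical scores, chosen only to illustrate the ordering.
hits = {
    "http://shard1/solr/core": [(200, 0.9), (100, 0.4)],
    "http://shard2/solr/core": [(200, 0.2), (300, 0.1)],
    "http://shard3/solr/core": [(100, 0.7)],
}
ranked, fetch_plan = merge_by_id(hits)
print([doc_id for doc_id, _members in ranked])
```

With these made-up scores, id 100 is placed by its score on shard3 (0.7) and not by its lower score on shard1, and shard3 is asked for ids 200 and 300 even though it returned neither.
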
> As usual, search parameters are passed on to each shard. So that the shards need no additional configuration in their definition of the /select request handler, we use the FilterQParserSearchComponent, which is configured to filter out the {!djoin} search parameter; otherwise, the target request handler complains about the missing query parser definition. See the example config below.
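
As a rough illustration of the filtering step, the sketch below drops the djoin rank query from a parameter map before it is forwarded to a shard. It assumes the parameter in question is `rq` (as in the example configuration) and that matching on the `{!djoin` prefix is sufficient; `strip_rank_query` is a hypothetical helper, not the search component from the patch.

```python
def strip_rank_query(params, parser="djoin"):
    """Return a copy of params without an rq value that names the given parser."""
    prefix = "{!" + parser
    return {
        name: value
        for name, value in params.items()
        # Only the rq parameter is inspected; everything else passes through.
        if not (name == "rq" and str(value).startswith(prefix))
    }


# Parameters as received by the federating handler (values from the example).
request = {"q": "*:*", "rq": "{!djoin}", "fl": "*,[shard]"}
shard_request = strip_rank_query(request)
print(shard_request)
```

The original map is left untouched, so the federating node can still apply the merge strategy itself while the shards see a request they can parse with a stock /select handler.
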
> This issue combines with others to provide full federated search support. See also SOLR-8235 and SOLR-8236.
> Note that this is part of a new implementation of federated search, as opposed to the older issues SOLR-3799 through SOLR-3805.
> --
> Example request handler configuration:
> {code:xml}
>   <searchComponent name="filter" class="org.apache.solr.search.federated.FilterDJoinQParserSearchComponent" />
>   
>   <queryParser name="djoin" class="org.apache.solr.search.federated.DJoinQParserPlugin" />
>   <requestHandler name="djoin" class="solr.SearchHandler">
>     <lst name="defaults">
>       <str name="shards">http://shard1/solr/core,http://shard2/solr/core,http://shard3/solr/core</str>
>       <bool name="shards.tolerant">true</bool>
>       <str name="rq">{!djoin}</str>
>     </lst>
>     <arr name="last-components">
>       <str>filter</str>
>     </arr>
>   </requestHandler> 
> {code}
> Example output:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">33</int>
>     <lst name="params">
>       <str name="q">*:*</str>
>       <str name="shards">http://shard1/solr/core,http://shard2/solr/core,http://shard3/solr/core</str>
>       <str name="shards.tolerant">true</str>
>       <str name="wt">xml</str>
>       <str name="rq">{!djoin}</str>
>       <str name="fl">*,[shard]</str>
>     </lst>
>   </lst>
>   <result name="response" numFound="5" start="0" maxScore="1.0">
>     <doc>
>       <bool name="__merge_parent__">true</bool>
>       <doc>
>         <int name="id">200</int>
>         <int name="value">1973</int>
>         <str name="[shard]">http://shard2/solr/core</str>
>         <long name="_version_">1515645309629235200</long>
>       </doc>
>       <doc>
>         <int name="id">200</int>
>         <int name="value">2015</int>
>         <str name="[shard]">http://shard1/solr/core</str>
>         <long name="_version_">1515645309682712576</long>
>       </doc>
>     </doc>
>     <doc>
>       <bool name="__merge_parent__">true</bool>
>       <doc>
>         <int name="id">100</int>
>         <int name="value">873</int>
>         <str name="[shard]">http://shard1/solr/core</str>
>         <long name="_version_">1515645309629254124</long>
>       </doc>
>       <doc>
>         <int name="id">100</int>
>         <int name="value">2001</int>
>         <str name="[shard]">http://shard3/solr/core</str>
>         <long name="_version_">1515645309682792852</long>
>       </doc>
>     </doc>
>     <doc>
>       <bool name="__merge_parent__">true</bool>
>       <doc>
>         <int name="id">300</int>
>         <int name="value">1492</int>
>         <str name="[shard]">http://shard2/solr/core</str>
>         <long name="_version_">1515645309629251252</long>
>       </doc>
>     </doc>
>   </result>
> </response>
> {code}
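
For a client consuming this response shape, a minimal sketch using Python's standard `xml.etree.ElementTree` might look like the following. The sample is a trimmed copy of the example output above; `read_merged_groups` and `fields_of` are names invented for the illustration, and field values are left as strings.

```python
import xml.etree.ElementTree as ET


def fields_of(doc):
    """Map field name to text for the non-doc children of a <doc> element."""
    return {el.get("name"): el.text for el in doc if el.tag != "doc"}


def read_merged_groups(xml_text):
    """Return one list of field dicts per position in the result set."""
    root = ET.fromstring(xml_text)
    groups = []
    for parent in root.findall("./result/doc"):
        flag = parent.find("bool[@name='__merge_parent__']")
        if flag is not None and flag.text == "true":
            # Merged parent: each child <doc> is one shard's copy.
            groups.append([fields_of(child) for child in parent.findall("doc")])
        else:
            # Ordinary document without the indicator field.
            groups.append([fields_of(parent)])
    return groups


sample = """<response>
  <result name="response" numFound="3" start="0">
    <doc>
      <bool name="__merge_parent__">true</bool>
      <doc>
        <int name="id">200</int>
        <str name="[shard]">http://shard2/solr/core</str>
      </doc>
      <doc>
        <int name="id">200</int>
        <str name="[shard]">http://shard1/solr/core</str>
      </doc>
    </doc>
    <doc>
      <bool name="__merge_parent__">true</bool>
      <doc>
        <int name="id">300</int>
        <str name="[shard]">http://shard2/solr/core</str>
      </doc>
    </doc>
  </result>
</response>"""

groups = read_merged_groups(sample)
for group in groups:
    print(group[0]["id"], [doc["[shard]"] for doc in group])
```
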



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

