lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: How to routing document for send to particular shard range
Date Tue, 02 Jan 2018 15:06:58 GMT
bq: Only thing which we can achieve is , documents will be routed
based on the hash values of the field values.

Then you have created your collection with compositeId routing or have
some other misconfiguration. You _must_
create your collection with "router.name=implicit".

Rather than _tell_ us what you're doing, please _show_.
1> the exact command you use to create your collection
2> the results of the collections API CLUSTERSTATUS command:
https://lucene.apache.org/solr/guide/6_6/collections-api.html
3> An example document and where you think it should be routed.
4> Where it actually ends up.
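
For item 2, the part of the CLUSTERSTATUS output worth checking first is the "router" entry for the collection. A rough sketch of that check in Python follows. It assumes the JSON form of the response has the cluster/collections/<name>/router layout; the sample dict is hand-written for illustration (not output from a real cluster) and router_name is a hypothetical helper.

```python
def router_name(status, collection):
    """Return the router name recorded for a collection in a CLUSTERSTATUS response."""
    return status["cluster"]["collections"][collection]["router"]["name"]


# Hand-written sample in the assumed response shape (NOT real cluster output).
sample = {
    "cluster": {
        "collections": {
            "eoe": {
                "router": {"name": "implicit", "field": "rfield"},
                "shards": {"active": {}, "inactive": {}, "terminated": {}},
            }
        }
    }
}

name = router_name(sample, "eoe")
if name != "implicit":
    # Anything other than "implicit" means documents are placed by hash.
    print("collection uses", name, "routing; documents will be hashed")
else:
    shards = sorted(sample["cluster"]["collections"]["eoe"]["shards"])
    print("implicit router, shards:", shards)
```

If this reports anything other than "implicit", the collection was not created the way you think it was.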

Again, you use the "active" tag as both the value of the route field and
the name of the shard. You can name the shard as you choose of course.

This works as I expect (Solr 6.3)

Create command:
localhost:8983/solr/admin/collections?action=CREATE&name=eoe&router.name=implicit&router.field=rfield&collection.configName=eoe&shards=active,inactive,terminated
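
If you script collection creation, the same call can be assembled in Python. This is only a sketch: build_create_url is a made-up helper name, and the host, collection name, and config name are simply the values from the command above.

```python
from urllib.parse import urlencode


def build_create_url(base, name, route_field, shards, config_name):
    """Build a Collections API CREATE URL that uses the implicit router."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",      # required for routing by shard name
        "router.field": route_field,    # field whose value names the target shard
        "collection.configName": config_name,
        "shards": ",".join(shards),     # shard names, not a shard count
    }
    # safe="," keeps the shard list readable instead of encoding it as %2C.
    return base + "/admin/collections?" + urlencode(params, safe=",")


url = build_create_url(
    "http://localhost:8983/solr",
    "eoe",
    "rfield",
    ["active", "inactive", "terminated"],
    "eoe",
)
print(url)
```

The point to notice is that "shards" carries the shard names themselves, and those names are what the route field values must match.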

rfield definition (not sure whether stored="true" or indexed="true"
are required)
<field name="rfield" type="string" indexed="true" stored="true"
required="true" multiValued="false" />


Query to check whether a doc is on a specific shard. Note that
&distrib=false restricts the query to the core indicated:
http://localhost:8983/solr/eoe_terminated_replica1/query?q=*:*&distrib=false
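
To run the same check against every shard, the per-core URLs can be generated. A small sketch, assuming the core names follow the <collection>_<shard>_replica1 pattern seen in the URL above (the pattern can differ between Solr versions, so check the actual core names in the admin UI); core_query_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def core_query_url(base, collection, shard, replica="replica1"):
    """Build a non-distributed match-all query URL for one core."""
    # Assumed core naming pattern: <collection>_<shard>_<replica>.
    core = f"{collection}_{shard}_{replica}"
    params = {"q": "*:*", "distrib": "false"}
    # safe="*:" keeps q=*:* readable instead of percent-encoding it.
    return f"{base}/{core}/query?" + urlencode(params, safe="*:")


urls = [
    core_query_url("http://localhost:8983/solr", "eoe", shard)
    for shard in ("active", "inactive", "terminated")
]
for u in urls:
    print(u)
```

Each URL should return only the documents whose rfield value matches that shard's name.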

Example XML docs:
<add>
<doc>
  <field name="id">doc1</field>
  <field name="rfield">active</field>
</doc>

<doc>
  <field name="id">doc2</field>
  <field name="rfield">inactive</field>
</doc>

<doc>
  <field name="id">doc3</field>
  <field name="rfield">terminated</field>
</doc>

</add>
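
The same payload can be generated rather than typed by hand. A minimal sketch using Python's standard xml.etree.ElementTree; build_add_xml is a hypothetical helper, and the ids and rfield values are the ones from the example docs above.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Serialize a list of {field name: value} dicts as a Solr <add> payload."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


statuses = ["active", "inactive", "terminated"]
# One document per status; rfield carries the name of the target shard.
docs = [{"id": f"doc{i}", "rfield": s} for i, s in enumerate(statuses, start=1)]
payload = build_add_xml(docs)
print(payload)
```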


Best,
Erick

On Tue, Jan 2, 2018 at 5:39 AM, Susheel Kumar <susheel2777@gmail.com> wrote:
> Hi Ketan,
>
> I believe you need multiple shards given the count of 800M.  How large will
> the index be?  Assume it comes out to 400 GB, your VMs/machines have 64 GB
> each, and you want each shard's index to fit in memory. With that I would
> create 10 shards on 10 machines (40 GB of index on each, with some buffer
> for growth).  Also use the _route_ parameter to make your queries faster.
>
> Thnx
>
> On Tue, Jan 2, 2018 at 5:27 AM, hemanth <k.hemanthkumar@gmail.com> wrote:
>
>> Hi Ketan,
>>
>> I also tried various ways to route documents to different shards based on
>> some routing key value. eg:  status: active,inactive and terminated should
>> go to 3 different shards. I tried creating implicit as well as composite id
>> routers. I could not route the documents to the shard I want. Only thing
>> which we can achieve is , documents will be routed based on the hash values
>> of the field values. This will do automatically and it will not help to
>> manually route to the shard we need. The api documents looks little fuzzy
>> and I think solr will not route the documents to the desired shard
>> manually.
>> I am referring 6.6 version. I also tried creating some dummy "_route_"
>> field
>> and copied my status to this field and tried. But no luck. By any chance if
>> you got the solution. Please let me know. I think , it will be one of the
>> important feature , that can be enhanced. Creating different collections ,
>> just for the difference of one field is of not good option. for eg: if we
>> have sales documents, we want to partition them by sales country. i.e USA
>> sales in one shard and Canada sales in one shard etc.. For this case , we
>> need one collection with many shards and each shard should contain the data
>> only to that particular shard.
>>
>> Thanks
>> Hemanth
>>
>>
>>
>>
>> --
>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>
