lucene-solr-user mailing list archives

From Joel Bernstein <joels...@gmail.com>
Subject Re: retrieving large number of docs
Date Wed, 03 Jun 2015 17:43:52 GMT
A few questions for you:

How large can the list of filtering IDs be?

What's your expectation on latency?

What version of Solr are you using?

SolrCloud or not?

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Jun 3, 2015 at 1:23 PM, Robust Links <peyman@robustlinks.com> wrote:

> Hi
>
> I have a set of document IDs from one core and I want to query another core
> using the IDs retrieved from the first core. The constraint is that the
> doc ID set can be very large. I want to:
>
> 1) retrieve these docs from the 2nd index
> 2) facet on the results
>
> I can think of 3 solutions:
>
> 1) boolean query
> 2) terms fq
> 3) use a DB rather than Solr
>
> I am trying to keep latencies down, so I would prefer not to use (3). The
> problem with (1) is that maxBooleanClauses is hardwired and I am not sure
> when I will hit the exception. Option (2) also seems to hit limits, so if I do
>
> select?fl=*&q=*:*&facet=true&facet.field=title&fq={!terms
> f=id}<LONG_LIST_OF_IDS>
>
> Solr just goes blank. I have tried adding cost=200 to try to run the query
> first, i.e. fq={!terms f=id cost=200}, but still no good. Paging on doc IDs
> could be a solution, but the problem then is that the faceting results
> correspond to the paged IDs and not the global set.
>
> My filter cache spec is as follows
>
>   <filterCache class="solr.FastLRUCache"
>                  size="1000000"
>                  initialSize="1000000"
>                  autowarmCount="100000"/>
>
>
> What would be the best way for me to solve this problem?
>
> thank you
>
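
For reference, here is a minimal Python sketch of how the terms filter from the quoted request could be assembled as a form-encoded body and sent with POST instead of in the URL. It is only an illustration under stated assumptions: the thread does not establish why Solr "goes blank", and URL length is just one possible cause. The helper names build_terms_fq and build_select_body are hypothetical; the field names id and title are taken from the quoted request.

```python
from urllib.parse import urlencode, parse_qs


def build_terms_fq(ids, field="id"):
    """Build a {!terms} filter query for the given IDs (hypothetical helper).

    The terms query parser splits its input on commas by default, so this
    sketch assumes no ID contains a comma.
    """
    return "{!terms f=%s}%s" % (field, ",".join(str(i) for i in ids))


def build_select_body(ids, facet_field="title"):
    """Form-encode the parameters of the quoted request (hypothetical helper).

    The result is meant to be sent as the body of a POST to the core's
    /select handler, so the ID list does not have to fit in the URL.
    """
    params = [
        ("q", "*:*"),
        ("fl", "*"),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("fq", build_terms_fq(ids)),
    ]
    return urlencode(params)


if __name__ == "__main__":
    body = build_select_body(["doc1", "doc2", "doc3"])
    # Decode again to show what Solr would receive as the fq parameter.
    print(parse_qs(body)["fq"][0])
```

The body would be posted with Content-Type application/x-www-form-urlencoded. Because the filter still covers the full ID set in one request, the facet counts would refer to the global set rather than to a page of IDs.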
