lucene-solr-user mailing list archives

From Joe Obernberger <joseph.obernber...@gmail.com>
Subject Re: Largest number of indexed documents used by Solr
Date Thu, 05 Apr 2018 20:41:00 GMT
50 billion per day?  Wow!  How large are these documents?

We have a cluster with one large collection that contains 2.4 billion
documents spread across 40 machines, using HDFS for the index.  We store
our data in HBase, and to re-index we pull from HBase and index with
SolrCloud.  The most we can do is around 57 million documents per day,
usually limited by pulling data out of HBase, not by Solr.

-Joe


On 4/4/2018 10:57 PM, 苗海泉 wrote:
> With 49 shards per collection and more than 600 collections, Solr has
> serious performance problems. I don't know how to deal with them. My
> advice to you is to minimize the number of collections.
> Our environment is 49 Solr server nodes, each with 32 CPUs / 128 GB, and
> the data volume is about 50 billion per day.
>
>
> 2018-04-04 9:23 GMT+08:00 Yago Riveiro <yago.riveiro@gmail.com>:
>
>> Hi,
>>
>> At my company we run a 12-node cluster with 10 billion (US billion,
>> 10^9) documents, 12 shards / 2 replicas.
>>
>> We mainly run faceting queries, with very reasonable performance.
>>
>> 36 million documents is not an issue; you can handle that volume of
>> documents with 2 nodes with SSDs and 32 GB of RAM.
>>
>> Regards.
>>
>> --
>>
>> Yago Riveiro
>>
>> On 4 Apr 2018 02:15 +0100, Abhi Basu <9000revs@gmail.com>, wrote:
>>> We have tested Solr 4.10 with 200 million docs with avg doc size of
>>> 250 KB.
>>> No issues with performance when using 3 shards / 2 replicas.
>>>
>>>
>>>
>>> On Tue, Apr 3, 2018 at 8:12 PM, Steven White <swhite4141@gmail.com> wrote:
>>>> Hi everyone,
>>>>
>>>> I'm about to start a project that requires indexing 36 million records
>>>> using Solr 7.2.1. Each record ranges from 500 KB to 0.25 MB, with an
>>>> average of 0.1 MB.
>>>>
>>>> Has anyone indexed this number of records? What are the things I should
>>>> worry about? And out of curiosity, what is the largest number of records
>>>> that Solr has indexed which is published out there?
>>>>
>>>> Thanks
>>>>
>>>> Steven
>>>>
>>>
>>>
>>> --
>>> Abhi Basu
>
>

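As a rough sanity check on the figures quoted in this thread, the sketch below works out two numbers: how long a full re-index of 2.4 billion documents would take at about 57 million documents per day, and the raw size of 36 million records at an average of 0.1 MB each. The function names are hypothetical, decimal units (1 TB = 1,000,000 MB) are an assumption, and raw data size is not the same thing as on-disk index size, so treat the result only as an order-of-magnitude estimate.

```python
# Back-of-the-envelope arithmetic using figures quoted in the thread.
# Assumption: decimal units, i.e. 1 TB = 1_000_000 MB.

def full_reindex_days(total_docs: int, docs_per_day: int) -> float:
    """Days needed to re-index total_docs at a steady docs_per_day rate."""
    return total_docs / docs_per_day


def raw_corpus_tb(num_docs: int, avg_doc_mb: float) -> float:
    """Raw (pre-index) corpus size in TB for num_docs of avg_doc_mb each."""
    return num_docs * avg_doc_mb / 1_000_000


# 2.4 billion documents at roughly 57 million documents per day.
reindex_days = full_reindex_days(2_400_000_000, 57_000_000)

# 36 million records with a stated average size of 0.1 MB.
raw_tb = raw_corpus_tb(36_000_000, 0.1)

print(f"Full re-index: about {reindex_days:.1f} days")  # about 42.1 days
print(f"Raw corpus:    about {raw_tb:.1f} TB")          # about 3.6 TB
```

At the quoted rate, a full re-index of the 2.4 billion document collection would take roughly six weeks, and the 36 million record project amounts to a few terabytes of raw input.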
