lucene-solr-user mailing list archives

From Nitin Solanki <nitinml...@gmail.com>
Subject Re: Whole RAM consumed while Indexing.
Date Wed, 18 Mar 2015 16:09:38 GMT
When I set the soft commit to 300 and the hard commit to 3000 and indexed
some amount of data, the whole index came to 6GB after indexing
completed.

When I changed the configuration to 60000 for the soft commit and 60000 for
the hard commit and indexed the same data, the whole index came to 5GB
after indexing completed.
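
For reference, the second setup would look roughly like this in
solrconfig.xml. This is only a sketch: it assumes the same autoCommit and
autoSoftCommit blocks quoted further down in this thread, with only the
maxTime defaults (in milliseconds) changed.

```xml
<!-- Sketch of the second setup (60000/60000).
     The first setup used 3000 (hard) and 300 (soft) instead. -->
<autoCommit>
  <!-- Hard commit every 60000 ms (60 seconds) -->
  <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
  <!-- Do not open a new searcher on hard commit -->
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <!-- Soft commit every 60000 ms (60 seconds) -->
  <maxTime>${solr.autoSoftCommit.maxTime:60000}</maxTime>
</autoSoftCommit>
```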

But the number of documents was the same in both scenarios. I am wondering
how that is possible?

On Wed, Mar 18, 2015 at 9:14 PM, Nitin Solanki <nitinmlvya@gmail.com> wrote:

> Hi Erick,
> I just want to be sure about the difference commits make: what happens
> if I commit frequently or not? The reason I say I need to commit so
> quickly is that I have to index 28GB of data, which takes 7-8 hours
> (with frequent commits).
> As you said, do commits after 60000 ms (60 seconds); then it will be more
> expensive.
> If I don't encounter **"overlapping searchers" warning messages**, then
> it seems to be okay. Is it?
>
>
>
>
> On Wed, Mar 18, 2015 at 8:54 PM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
>> Don't do it. Really, why do you want to do this? This seems like
>> an "XY" problem, you haven't explained why you need to commit
>> things so very quickly.
>>
>> I suspect you haven't tried _searching_ while committing at such
>> a rate, and you might as well turn all your top-level caches off
>> in solrconfig.xml since they won't be useful at all.
>>
>> Best,
>> Erick
>>
>> On Wed, Mar 18, 2015 at 6:24 AM, Nitin Solanki <nitinmlvya@gmail.com>
>> wrote:
>> > Hi,
>> > If I do very, very fast indexing (softcommit = 300 and hardcommit =
>> > 3000) vs. slow indexing (softcommit = 60000 and hardcommit = 60000),
>> > as you both said, will fast indexing fail to index some data?
>> > Any suggestions on this?
>> >
>> > On Tue, Mar 17, 2015 at 2:29 AM, Ramkumar R. Aiyengar <
>> > andyetitmoves@gmail.com> wrote:
>> >
>> >> Yes, and doing so is painful and takes lots of people and hardware
>> >> resources to get there for large amounts of data and queries :)
>> >>
>> >> As Erick says, work backwards from 60s and first establish how high the
>> >> commit interval can be to satisfy your use case..
>> >> On 16 Mar 2015 16:04, "Erick Erickson" <erickerickson@gmail.com> wrote:
>> >>
>> >> > First start by lengthening your soft and hard commit intervals
>> >> > substantially. Start with 60000 and work backwards I'd say.
>> >> >
>> >> > Ramkumar has tuned the heck out of his installation to get the commit
>> >> > intervals to be that short ;).
>> >> >
>> >> > I'm betting that you'll see your RAM usage go way down, but that's
>> >> > a guess until you test.
>> >> >
>> >> > Best,
>> >> > Erick
>> >> >
>> >> > On Sun, Mar 15, 2015 at 10:56 PM, Nitin Solanki <nitinmlvya@gmail.com>
>> >> > wrote:
>> >> > > Hi Erick,
>> >> > > You are correct. Some **"overlapping searchers" warning messages**
>> >> > > are coming up in the logs.
>> >> > > The **numDocs numbers** are changing as documents are added during
>> >> > > indexing.
>> >> > > Any help?
>> >> > >
>> >> > > On Sat, Mar 14, 2015 at 11:24 PM, Erick Erickson <erickerickson@gmail.com>
>> >> > > wrote:
>> >> > >
>> >> > >> First, the soft commit interval is very short. Very, very, very,
>> >> > >> very short. 300ms is just short of insane unless it's a typo ;).
>> >> > >>
>> >> > >> Here's a long background:
>> >> > >>
>> >> > >> https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>> >> > >>
>> >> > >> But the short form is that you're opening searchers every 300 ms.
>> >> > >> The hard commit is better, but every 3 seconds is still far too
>> >> > >> short IMO. I'd start with soft commits of 60000 and hard commits
>> >> > >> of 60000 (60 seconds), meaning that you're going to have to wait
>> >> > >> 1 minute for docs to show up unless you explicitly commit.
>> >> > >>
>> >> > >> You're throwing away all the caches configured in solrconfig.xml
>> >> > >> more than 3 times a second, executing autowarming, etc, etc, etc....
>> >> > >>
>> >> > >> Changing these to longer intervals might cure the problem, but if
>> >> > >> not then, as Hoss would say, "details matter". I suspect you're
>> >> > >> also seeing "overlapping searchers" warning messages in your log,
>> >> > >> and it's _possible_ that what's happening is that you're just
>> >> > >> exceeding the max warming searchers and never opening a new
>> >> > >> searcher with the newly-indexed documents.
>> >> > >> But that's a total shot in the dark.
>> >> > >>
>> >> > >> How are you looking for docs (and not finding them)? Does the
>> >> > >> numDocs number in the solr admin screen change?
>> >> > >>
>> >> > >>
>> >> > >> Best,
>> >> > >> Erick
>> >> > >>
>> >> > >> On Thu, Mar 12, 2015 at 10:27 PM, Nitin Solanki <nitinmlvya@gmail.com>
>> >> > >> wrote:
>> >> > >> > Hi Alexandre,
>> >> > >> >
>> >> > >> >
>> >> > >> > *Hard Commit* is :
>> >> > >> >
>> >> > >> >      <autoCommit>
>> >> > >> >        <maxTime>${solr.autoCommit.maxTime:3000}</maxTime>
>> >> > >> >        <openSearcher>false</openSearcher>
>> >> > >> >      </autoCommit>
>> >> > >> >
>> >> > >> > *Soft Commit* is :
>> >> > >> >
>> >> > >> > <autoSoftCommit>
>> >> > >> >     <maxTime>${solr.autoSoftCommit.maxTime:300}</maxTime>
>> >> > >> > </autoSoftCommit>
>> >> > >> >
>> >> > >> > And I am committing 20000 documents each time.
>> >> > >> > Is this a good config for committing?
>> >> > >> > Or am I doing something wrong?
>> >> > >> >
>> >> > >> >
>> >> > >> > On Fri, Mar 13, 2015 at 8:52 AM, Alexandre Rafalovitch <arafalov@gmail.com>
>> >> > >> > wrote:
>> >> > >> >
>> >> > >> >> What's your commit strategy? Explicit commits? Soft commits/hard
>> >> > >> >> commits (in solrconfig.xml)?
>> >> > >> >>
>> >> > >> >> Regards,
>> >> > >> >>    Alex.
>> >> > >> >> ----
>> >> > >> >> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>> >> > >> >> http://www.solr-start.com/
>> >> > >> >>
>> >> > >> >>
>> >> > >> >> On 12 March 2015 at 23:19, Nitin Solanki <nitinmlvya@gmail.com> wrote:
>> >> > >> >> > Hello,
>> >> > >> >> > I have written a Python script that indexes 20000 documents
>> >> > >> >> > at a time into Solr. I have 28 GB RAM with 8 CPUs.
>> >> > >> >> > When I started indexing, 15 GB of RAM was free. While
>> >> > >> >> > indexing, all RAM is consumed but **not** a single document
>> >> > >> >> > is indexed. Why so?
>> >> > >> >> > And it threw *HTTPError: HTTP Error 503: Service Unavailable*
>> >> > >> >> > in the Python script.
>> >> > >> >> > I think it is due to heavy load on ZooKeeper, because of which
>> >> > >> >> > all nodes went down.
>> >> > >> >> > I am not sure about that. Any help please..
>> >> > >> >> > Or is something else happening?
>> >> > >> >> > And how can I overcome this issue?
>> >> > >> >> > Please point me towards the right path.
>> >> > >> >> > Thanks..
>> >> > >> >> >
>> >> > >> >> > Warm Regards,
>> >> > >> >> > Nitin Solanki
>> >> > >> >>
>> >> > >>
>> >> >
>> >>
>>
>
>
