lucene-solr-user mailing list archives

From Rohit Kanchan <rohitkan2...@gmail.com>
Subject Re: Solr Delete By Id Out of memory issue
Date Mon, 03 Apr 2017 22:29:28 GMT
Thanks everyone for replying to this issue. Just a final comment on this
issue, which I was working on closely. We have fixed it. It was a
bug in the custom component we wrote to convert delete by query to
delete by id. We were using BytesRef incorrectly: we were not making a deep
copy, and that was causing the OOM. We now make a deep copy, and the old
deletes map now appears to stay within its capacity of 1K entries.

After deploying this change, we took another heap dump and it no longer
lists this as a leak suspect. Please let me know if anyone has questions.
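
A minimal sketch of how a key that only views a shared buffer can defeat a
size-capped LinkedHashMap, while a deep-copied key does not. This is a toy
model and not Solr or Lucene code: IdRef, boundedMap, run and the int[] pool
are hypothetical stand-ins for a BytesRef-like key and a 1K-capped old deletes
map, and the thread does not show the actual component. The growth shown
relies on the eviction step re-hashing the eldest key, which is how OpenJDK's
HashMap behaves.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: none of these names come from Solr or Lucene. */
final class OldDeletesAliasingDemo {

    /** Stand-in for a BytesRef-like key: a view onto one slot of a buffer. */
    static final class IdRef {
        final int[] buf;
        final int slot;

        IdRef(int[] buf, int slot) {
            this.buf = buf;
            this.slot = slot;
        }

        /** Deep copy: a private buffer that later writes to the pool cannot touch. */
        IdRef deepCopy() {
            return new IdRef(new int[] { buf[slot] }, 0);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof IdRef)) {
                return false;
            }
            IdRef other = (IdRef) o;
            return other.buf[other.slot] == buf[slot];
        }

        @Override
        public int hashCode() {
            // Content-based, so it changes whenever the viewed slot is rewritten.
            return Integer.hashCode(buf[slot]);
        }
    }

    /** Insertion-ordered map that tries to evict its eldest entry once size exceeds cap. */
    static Map<IdRef, Integer> boundedMap(final int cap) {
        return new LinkedHashMap<IdRef, Integer>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<IdRef, Integer> eldest) {
                return size() > cap;
            }
        };
    }

    /** Records several batches of ids, reusing one pool buffer; returns the final map size. */
    static int run(boolean deepCopy, int cap, int batches, int batchSize) {
        Map<IdRef, Integer> oldDeletes = boundedMap(cap);
        int[] pool = new int[batchSize];
        int nextId = 0;
        for (int b = 0; b < batches; b++) {
            // Collect the ids for this batch, overwriting the previous batch.
            for (int j = 0; j < batchSize; j++) {
                pool[j] = nextId++;
            }
            // Record each id as a map key, either as a view or as a deep copy.
            for (int j = 0; j < batchSize; j++) {
                IdRef view = new IdRef(pool, j);
                oldDeletes.put(deepCopy ? view.deepCopy() : view, b);
            }
        }
        return oldDeletes.size();
    }

    public static void main(String[] args) {
        System.out.println("shared views: " + run(false, 1000, 50, 100));
        System.out.println("deep copies:  " + run(true, 1000, 50, 100));
    }
}
```

With shared views, the eldest key no longer hashes to the bucket it was stored
in, so eviction cannot find it and the map keeps growing. With deep copies the
map stays at the cap. In Lucene itself, BytesRef.deepCopyOf(BytesRef) produces
such an independent copy; whether the fix used that exact call is not stated
in the thread.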

Thanks
Rohit


On Mon, Mar 27, 2017 at 11:56 AM, Rohit Kanchan <rohitkan2000@gmail.com>
wrote:

> Thanks Erick for replying. I have deployed the changes to production; we
> will find out soon whether it is still causing OOM. As for commits,
> we are doing auto commits after 10K docs or 30 secs.
> If I get time I will try to run a local test to check whether we hit OOM
> because of the 1K map entries. I will update this thread with my
> findings. I really appreciate your and Chris's responses.
>
> Thanks
> Rohit
>
>
> On Mon, Mar 27, 2017 at 10:47 AM, Erick Erickson <erickerickson@gmail.com>
> wrote:
>
>> Rohit:
>>
>> Well, whenever I see something like "I have this custom component..."
>> I immediately want the problem to be demonstrated without that custom
>> component before trying to debug Solr.
>>
>> As Chris explained, we can't clear the 1K entries. It's hard to
>> imagine why keeping the last 1,000 entries around would cause OOMs.
>>
>> You haven't demonstrated yet that after your latest change you still
>> get OOMs, you've just assumed so. After running for a "long time" do
>> you still see the problem after your changes?
>>
>> So before assuming it's a Solr bug, and after you demonstrate that
>> your latest change didn't solve the problem, you should try two
>> things:
>>
>> 1> as I suggested and Chris endorsed, try committing upon occasion
>> from your custom component. Or set your autocommit settings
>> appropriately if you haven't already.
>>
>> 2> run your deletes from the client as a test. You've created a custom
>> URP component because you "didn't want to run the queries from the
>> client". That's perfectly reasonable; it's just that, to know where to
>> look, deleting from the client would eliminate your custom code and
>> tell us where to focus.
>>
>> Best,
>> Erick
>>
>>
>>
>> On Sat, Mar 25, 2017 at 1:21 PM, Rohit Kanchan <rohitkan2000@gmail.com>
>> wrote:
>> > I think we figured out the issue. When we were converting delete by query
>> > in a Solr handler, we were not making a deep copy of BytesRef. We were
>> > keeping a reference to the same object, which was causing the old deletes
>> > map (LinkedHashMap) to grow beyond 1K entries.
>> >
>> > But I think it is still not clearing those 1K entries. Eventually it will
>> > throw OOM, because UpdateLog is not a singleton, and when there are many
>> > deletes by id and the server is not restarted for a very long time, it
>> > will eventually throw OOM. I think we should clear this map when we are
>> > committing. I am not a committer; it would be great to get a reply from a
>> > committer. What do you guys think?
>> >
>> > Thanks
>> > Rohit
>> >
>> >
>> > On Wed, Mar 22, 2017 at 1:36 PM, Rohit Kanchan <rohitkan2000@gmail.com>
>> > wrote:
>> >
>> >> For commits we are relying on auto commits. We have defined the
>> >> following in our configs:
>> >>
>> >>     <autoCommit>
>> >>         <maxDocs>10000</maxDocs>
>> >>         <maxTime>30000</maxTime>
>> >>         <openSearcher>false</openSearcher>
>> >>     </autoCommit>
>> >>
>> >>     <autoSoftCommit>
>> >>         <maxTime>15000</maxTime>
>> >>     </autoSoftCommit>
>> >>
>> >> One thing I would like to mention is that we are not calling deleteById
>> >> directly from the client. We have created an update chain and added a
>> >> processor to it. In this processor we query first, collect all
>> >> byteRefHash, get each byteRef out of it and set it to indexedId.
>> >> After collecting the indexedId values we use those ids to call delete
>> >> by id. We are doing this because we do not want to query Solr before
>> >> deleting on the client side. It is possible that there is a bug in this
>> >> code, but I am not sure, because when I run tests locally they do not
>> >> show any issues. I am trying to remote debug now.
>> >>
>> >> Thanks
>> >> Rohit
>> >>
>> >>
>> >> On Wed, Mar 22, 2017 at 9:57 AM, Chris Hostetter <hossman_lucene@fucit.org>
>> >> wrote:
>> >>
>> >>>
>> >>> : OK, The whole DBQ thing baffles the heck out of me so this may be
>> >>> : totally off base. But would committing help here? Or at least be
>> >>> : worth a test?
>> >>>
>> >>> this isn't DBQ -- the OP specifically said deleteById, and that the
>> >>> oldDeletes map (only used for DBI) was the problem according to the
>> >>> heap dumps they looked at.
>> >>>
>> >>> I suspect you are correct about the root cause of the OOMs ... perhaps
>> >>> the OP isn't using hard/soft commits effectively enough and the
>> >>> uncommitted data is what's causing the OOM ... hard to say w/o more
>> >>> details, or confirmation of exactly what the OP was looking at in their
>> >>> claim below about the heap dump....
>> >>>
>> >>>
>> >>> : > : Thanks for replying. We are using Solr 6.1 version. Even I saw that
>> >>> : > : it is bounded by 1K count, but after looking at heap dump I was
>> >>> : > : amazed how can it keep more than 1K entries. But Yes I see around 7M
>> >>> : > : entries according to heap dump and around 17G of memory occupied by
>> >>> : > : BytesRef there.
>> >>> : >
>> >>> : > what exactly are you looking at when you say you see "7M entries" ?
>> >>> : >
>> >>> : > are you sure you aren't confusing the keys in oldDeletes with other
>> >>> : > instances of BytesRef in the JVM?
>> >>>
>> >>>
>> >>> -Hoss
>> >>> http://www.lucidworks.com/
>> >>>
>> >>
>> >>
>>
>
>
