ignite-dev mailing list archives

From Denis Magda <dma...@apache.org>
Subject Re: Rebalancing - how to make it faster
Date Fri, 23 Mar 2018 19:14:35 GMT
Ilya,

That's a decent boost (5-20%) even with WAL enabled. I'm not sure we should
bet on the WAL "off" mode here: if the whole cluster goes down, data
consistency becomes questionable. As an architect, I wouldn't disable WAL
for the sake of rebalancing; it's too risky.

If you agree, then let's create the IEP. This way it will be easier to
track this endeavor. BTW, are you ready to release any of these
optimizations in 2.5, which is being discussed in a separate thread?

--
Denis



On Fri, Mar 23, 2018 at 6:37 AM, Ilya Lantukh <ilantukh@gridgain.com> wrote:

> Denis,
>
> > - Don't you want to aggregate the tickets under an IEP?
> Yes, I think so.
>
> > - Does it mean we're going to update our B+Tree implementation? Any ideas
> > how risky it is?
> One of the tickets that I created
> (https://issues.apache.org/jira/browse/IGNITE-7935) involves B+Tree
> modification, but I am not planning to do it in the near future. It
> shouldn't affect existing tree operations, only introduce new ones (putAll,
> invokeAll, removeAll).
>
> > - Any chance you had a prototype that shows performance optimizations of
> > the approach you are suggesting to take?
> I have a prototype for the simplest improvements
> (https://issues.apache.org/jira/browse/IGNITE-8019 and
> https://issues.apache.org/jira/browse/IGNITE-8018) - together they increase
> throughput by 5-20%, depending on configuration and environment. Also, I've
> tested different WAL modes - switching from LOG_ONLY to NONE gives over a
> 100% boost - this is what I expect from
> https://issues.apache.org/jira/browse/IGNITE-8017.
>
> On Thu, Mar 22, 2018 at 9:48 PM, Denis Magda <dmagda@apache.org> wrote:
>
> > Ilya,
> >
> > That's outstanding research and summary. Thanks for spending your time on
> > this.
> >
> > Not sure I have enough expertise to challenge your approach, but it sounds
> > 100% reasonable to me. As side notes:
> >
> >    - Don't you want to aggregate the tickets under an IEP?
> >    - Does it mean we're going to update our B+Tree implementation? Any
> >    ideas how risky it is?
> >    - Any chance you had a prototype that shows performance optimizations
> >    of the approach you are suggesting to take?
> >
> > --
> > Denis
> >
> > On Thu, Mar 22, 2018 at 8:38 AM, Ilya Lantukh <ilantukh@gridgain.com>
> > wrote:
> >
> > > Igniters,
> > >
> > > I've spent some time analyzing the performance of the rebalancing
> > > process. The initial goal was to understand what limits its throughput,
> > > because it is significantly slower than the network and storage device
> > > can theoretically handle.
> > >
> > > Turns out, our current implementation has a number of issues caused by
> > > a single fundamental problem.
> > >
> > > During rebalance, data is sent in batches called
> > > GridDhtPartitionSupplyMessages. Batch size is configurable; the default
> > > value is 512KB, which could mean thousands of key-value pairs. However,
> > > we don't take any advantage of this fact and process each entry
> > > independently:
> > > - checkpointReadLock is acquired multiple times for every entry, leading
> > > to unnecessary contention - this is clearly a bug;
> > > - for each entry we write (and fsync, if the configuration requires it) a
> > > separate WAL record - so, if a batch contains N entries, we might end up
> > > doing N fsyncs;
> > > - adding every entry into CacheDataStore also happens completely
> > > independently. This means we will traverse and modify each index tree N
> > > times, allocate space in FreeList N times, and additionally store
> > > O(N*log(N)) page delta records in the WAL.
> > >
> > > I've created a few tickets in JIRA with very different levels of scale
> > > and complexity.
> > >
> > > Ways to reduce the impact of independent processing:
> > > - https://issues.apache.org/jira/browse/IGNITE-8019 - the aforementioned
> > > bug, causing contention on checkpointReadLock;
> > > - https://issues.apache.org/jira/browse/IGNITE-8018 - inefficiency in
> > > GridCacheMapEntry implementation;
> > > - https://issues.apache.org/jira/browse/IGNITE-8017 - automatically
> > > disable WAL during preloading.
> > >
> > > Ways to solve the problem on a more global level:
> > > - https://issues.apache.org/jira/browse/IGNITE-7935 - a ticket to
> > > introduce batch modification;
> > > - https://issues.apache.org/jira/browse/IGNITE-8020 - complete redesign
> > > of the rebalancing process for persistent caches, based on file transfer.
> > >
> > > Everyone is welcome to criticize the above ideas, suggest new ones, or
> > > participate in the implementation.
> > >
> > > --
> > > Best regards,
> > > Ilya
> > >
> >
>
>
>
> --
> Best regards,
> Ilya
>
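
To make the "N entries, N fsyncs" point from the original message concrete,
here is a minimal Java sketch of per-entry versus batched WAL writes. It is a
toy model under stated assumptions, not Ignite code: CountingWal,
applyPerEntry and applyBatched are hypothetical names, and the "WAL" only
counts records and fsync calls instead of touching disk.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchWalSketch {
    /** Toy write-ahead log (hypothetical, not Ignite's WAL): counts calls only. */
    static final class CountingWal {
        int records;
        int fsyncs;

        void append(String key) { records++; }

        void fsync() { fsyncs++; }
    }

    /** Models the per-entry behavior described above: one record and one fsync per entry. */
    static void applyPerEntry(List<String> batch, CountingWal wal) {
        for (String key : batch) {
            wal.append(key);
            wal.fsync();
        }
    }

    /** Models a batched alternative: N records, then a single fsync for the whole batch. */
    static void applyBatched(List<String> batch, CountingWal wal) {
        for (String key : batch)
            wal.append(key);

        if (!batch.isEmpty())
            wal.fsync();
    }

    public static void main(String[] args) {
        // A supply message may carry thousands of entries; 1000 is an arbitrary example size.
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < 1000; i++)
            batch.add("key-" + i);

        CountingWal perEntry = new CountingWal();
        applyPerEntry(batch, perEntry);

        CountingWal batched = new CountingWal();
        applyBatched(batch, batched);

        System.out.println("per-entry: " + perEntry.records + " records, " + perEntry.fsyncs + " fsyncs");
        System.out.println("batched:   " + batched.records + " records, " + batched.fsyncs + " fsyncs");
    }
}
```

Both variants log the same number of records; only the number of fsync calls
differs (N versus 1 per batch). Whether a single fsync per batch fits Ignite's
durability guarantees is a separate design question that this sketch does not
address.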
