ignite-dev mailing list archives

From Alexey Goncharuk <alexey.goncha...@gmail.com>
Subject Re: How to free up space on disc after removing entries from IgniteCache with enabled PDS?
Date Wed, 18 Sep 2019 17:03:38 GMT
Denis,

It's not fundamental, but quite complex. In Postgres, for example, this is
not maintained automatically and store compaction is performed with the
VACUUM FULL command, which acquires an exclusive table lock, so no
concurrent activity on the table is possible.

The solution Anton suggested does not look easy because it will most
likely hurt performance significantly: it is hard to maintain a data
structure to choose the "page from free-list with enough space closest to
the beginning of the file". Overall, we could do something similar to
Postgres, where space is freed in a maintenance mode.

Online space cleanup sounds tricky to me; at least I cannot think of a
plausible solution right away.

--AG

Fri, Sep 13, 2019 at 7:43 PM, Denis Magda <dmagda@apache.org>:

> The issue starts hitting others who deploy Ignite persistence in
> production:
> https://issues.apache.org/jira/browse/IGNITE-12152
>
> Alex, I'm curious whether this is a fundamental problem. I asked the same
> question in JIRA but, probably, this discussion is a better place to get
> to the bottom of it first:
> https://issues.apache.org/jira/browse/IGNITE-10862
>
> -
> Denis
>
>
> On Thu, Jan 10, 2019 at 6:01 AM Anton Vinogradov <av@apache.org> wrote:
>
> > Dmitriy,
> >
> > This does not look like a production-ready case :)
> >
> > How about:
> > 1) Once you need to write an entry, you have to choose not a random
> > "page from free-list with enough space" but the "page from free-list
> > with enough space closest to the beginning of the file".
> >
> > 2) Once you remove an entry, you have to merge the rest of the entries
> > on this page into the "page from free-list with enough space closest to
> > the beginning of the file", if possible. (optional)
> >
> > 3) A partition file tail consisting of empty pages can be removed at
> > any time.
> >
> > 4) In case you have cold data inside the tail, just lock the page and
> > migrate it to the "page from free-list with enough space closest to the
> > beginning of the file". This operation can be scheduled.
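Anton's step (1) could be backed by a free-list ordered by file offset. The following is a minimal stdlib sketch of that idea, not Ignite's actual free-list implementation; the class and method names are illustrative. Note that the naive linear scan is O(n) per lookup, which illustrates the performance concern Alexey raises above.

```java
import java.util.*;

// Sketch of step (1): pick the free page with enough space that is closest
// to the beginning of the partition file. A simplified stand-in for
// Ignite's free-list, not the real data structure.
public class OffsetOrderedFreeList {
    // page file offset -> free bytes remaining on that page
    private final NavigableMap<Long, Integer> freePages = new TreeMap<>();

    public void addFreePage(long offset, int freeBytes) {
        freePages.put(offset, freeBytes);
    }

    /** Offset of the lowest-offset page with at least 'needed' free bytes, or -1. */
    public long choosePage(int needed) {
        // TreeMap iterates in ascending key order, i.e. from the file start.
        for (Map.Entry<Long, Integer> e : freePages.entrySet()) {
            if (e.getValue() >= needed)
                return e.getKey();
        }
        return -1L; // no suitable page; a new page would be allocated
    }

    public static void main(String[] args) {
        OffsetOrderedFreeList list = new OffsetOrderedFreeList();
        list.addFreePage(4096L, 100);   // near file start, too little space
        list.addFreePage(8192L, 900);   // near file start, enough space
        list.addFreePage(65536L, 2000); // in the file tail
        System.out.println(list.choosePage(500)); // prints 8192, not 65536
    }
}
```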
> >
> > On Wed, Jan 9, 2019 at 4:43 PM Dmitriy Pavlov <dpavlov@apache.org> wrote:
> >
> > > In the TC Bot, I used to create a second cache named CacheV2 and
> > > migrate the needed data from CacheV1 to CacheV2.
> > >
> > > After CacheV1 destroy(), files are removed and disk space is freed.
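Why this migrate-and-destroy workaround reclaims space, while in-place removal does not, can be shown with a small stdlib simulation. The cache model below is made up for illustration (one entry per page, an allocated-pages high-water mark); it does not use the real Ignite API.

```java
import java.util.*;

// Simulated cache: removing entries keeps allocated pages on disk (the
// observed problem), while dropping the whole cache deletes its files.
// Models the TC Bot workaround; not the real Ignite API.
public class MigrateAndDestroy {
    static class SimCache {
        Map<String, String> entries = new HashMap<>();
        int allocatedPages; // high-water mark of pages backing partition files

        void put(String k, String v) {
            entries.put(k, v);
            // simplification: one entry per page
            allocatedPages = Math.max(allocatedPages, entries.size());
        }
        void remove(String k) { entries.remove(k); } // pages stay allocated
    }

    public static void main(String[] args) {
        SimCache v1 = new SimCache();
        for (int i = 0; i < 100; i++) v1.put("k" + i, "v" + i);
        for (int i = 0; i < 50; i++) v1.remove("k" + i);
        System.out.println("V1 pages after removal: " + v1.allocatedPages); // still 100

        // Workaround: copy live entries into a fresh cache, drop the old one.
        SimCache v2 = new SimCache();
        for (Map.Entry<String, String> e : v1.entries.entrySet())
            v2.put(e.getKey(), e.getValue());
        v1 = null; // stands in for destroy(): files removed, space freed
        System.out.println("V2 pages: " + v2.allocatedPages); // 50
    }
}
```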
> > >
> > > Wed, Jan 9, 2019 at 12:04 PM, Ivan Pavlukhin <vololo100@gmail.com>:
> > >
> > > > Vyacheslav,
> > > >
> > > > Have you investigated how other vendors (Oracle, Postgres) tackle
> this
> > > > problem?
> > > >
> > > > I have one wild idea. Could the problem be solved by stopping the
> > > > node which needs to be defragmented, clearing its persistence files
> > > > and restarting the node? After rebalancing, the node will receive
> > > > all its data back without fragmentation. I see a big downside --
> > > > sending data across the network. But perhaps we can play with
> > > > affinity and start a new node on the same host which will receive
> > > > the same data; after that the old node can be stopped. It looks more
> > > > like a workaround, but perhaps it can be turned into a workable
> > > > solution.
> > > >
> > > > Wed, Jan 9, 2019 at 10:49 AM, Vyacheslav Daradur <daradurvs@gmail.com>:
> > > > >
> > > > > Yes, it's about Page Memory defragmentation.
> > > > >
> > > > > Pages in partition files are stored sequentially; possibly it
> > > > > makes sense to defragment pages first to avoid inter-page gaps,
> > > > > since we use page offsets to manage them.
> > > > >
> > > > > I filed an issue [1]; I hope we will be able to find resources to
> > > > > solve it before the 2.8 release.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/IGNITE-10862
> > > > >
> > > > > On Sat, Dec 29, 2018 at 10:47 AM Ivan Pavlukhin
> > > > > <vololo100@gmail.com> wrote:
> > > > > >
> > > > > > I suppose it is about Ignite Page Memory pages defragmentation.
> > > > > >
> > > > > > We can get 100 allocated pages, each of which becomes only e.g.
> > > > > > 50% filled after removing some entries. But they will still
> > > > > > occupy space for 100 pages on the hard drive.
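Ivan's arithmetic can be made concrete with a few lines of Java; the page size is an illustrative figure, only the ratio matters:

```java
// Illustrative arithmetic for the fragmentation example above: 100 pages,
// each half empty, still occupy the full 100 pages on disk, while the live
// data would fit into 50 pages after compaction.
public class FragmentationMath {
    /** Pages needed to hold the live data after compaction (ceiling division). */
    static long pagesNeeded(int allocatedPages, int pageSize, double fillFactor) {
        long liveData = (long) (allocatedPages * (long) pageSize * fillFactor);
        return (liveData + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 4096;     // bytes per page (illustrative)
        int allocatedPages = 100;
        double fillFactor = 0.5; // each page is half empty after removals

        long onDisk = (long) allocatedPages * pageSize; // 409600 bytes stay on disk
        System.out.println("on disk: " + onDisk + " bytes; pages needed after compaction: "
            + pagesNeeded(allocatedPages, pageSize, fillFactor)); // 50 pages
    }
}
```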
> > > > > >
> > > > > > Fri, Dec 28, 2018 at 8:45 PM, Denis Magda <dmagda@apache.org>:
> > > > > > >
> > > > > > > Shouldn't the OS take care of defragmentation? What we need to
> > > > > > > do is give a way to remove stale data and "release" the
> > > > > > > allocated space somehow through tools, MBeans or API methods.
> > > > > > >
> > > > > > > --
> > > > > > > Denis
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Dec 28, 2018 at 6:24 AM Vladimir Ozerov
> > > > > > > <vozerov@gridgain.com> wrote:
> > > > > > >
> > > > > > > > Hi Vyacheslav,
> > > > > > > >
> > > > > > > > AFAIK this is not implemented. Shrinking/defragmentation is
> > > > > > > > an important optimization, not only because it releases free
> > > > > > > > space, but also because it decreases the total number of
> > > > > > > > pages. But it is not very easy to implement, as you have to
> > > > > > > > reshuffle both data entries and index entries while
> > > > > > > > maintaining consistency for concurrent reads and updates.
> > > > > > > > Alternatively, we can think of offline defragmentation. It
> > > > > > > > would be easier to implement and faster, but concurrent
> > > > > > > > operations would be prohibited.
> > > > > > > >
> > > > > > > > On Fri, Dec 28, 2018 at 4:08 PM Vyacheslav Daradur
> > > > > > > > <daradurvs@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Igniters, we have faced the following problem on one of
> > > > > > > > > our deployments.
> > > > > > > > >
> > > > > > > > > Let's imagine that we have used an IgniteCache with PDS
> > > > > > > > > enabled over time:
> > > > > > > > > - hardware disk space was occupied as the amount of data
> > > > > > > > > grew, e.g. to 100 GB;
> > > > > > > > > - then we removed non-actual data, e.g. 50 GB, which had
> > > > > > > > > become useless for us;
> > > > > > > > > - disk usage stopped growing with new data, but the space
> > > > > > > > > was not released and still took 100 GB instead of the
> > > > > > > > > expected 50 GB.
> > > > > > > > >
> > > > > > > > > Another use case:
> > > > > > > > > - a user extracts data from an IgniteCache to store it in
> > > > > > > > > a separate IgniteCache or another store;
> > > > > > > > > - the disk is still occupied and the user is not able to
> > > > > > > > > store data in the different cache on the same cluster
> > > > > > > > > because of the disk limitation.
> > > > > > > > >
> > > > > > > > > How can we help the user free up disk space if the amount
> > > > > > > > > of data in an IgniteCache has been reduced many times over
> > > > > > > > > and will not increase in the near future?
> > > > > > > > >
> > > > > > > > > AFAIK, we have a mechanism for reusing memory pages, which
> > > > > > > > > allows us to use pages that were allocated for removed
> > > > > > > > > data to store new data.
> > > > > > > > > Are there any chances to shrink data and free up space on
> > > > > > > > > disk (with defragmentation if possible)?
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best Regards, Vyacheslav D.
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > > Ivan Pavlukhin
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best Regards, Vyacheslav D.
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > > Ivan Pavlukhin
> > > >
> > >
> >
>
