ignite-dev mailing list archives

From Denis Magda <dma...@apache.org>
Subject Re: How to free up space on disc after removing entries from IgniteCache with enabled PDS?
Date Fri, 13 Sep 2019 16:42:57 GMT
The issue starts hitting others who deploy Ignite persistence in production:
https://issues.apache.org/jira/browse/IGNITE-12152

Alex, I'm curious whether this is a fundamental problem. I asked the same
question in JIRA, but this discussion is probably a better place to get to
the bottom of it first:
https://issues.apache.org/jira/browse/IGNITE-10862

-
Denis


On Thu, Jan 10, 2019 at 6:01 AM Anton Vinogradov <av@apache.org> wrote:

> Dmitriy,
>
> This does not look like a production-ready case :)
>
> How about
> 1) Once you need to write an entry, you have to choose not a random "page
> from free-list with enough space"
> but the "page from free-list with enough space closest to the beginning of
> the file".
>
> 2) Once you remove an entry, you have to merge the rest of the entries on
> this page into the
> "page from free-list with enough space closest to the beginning of the
> file"
> if possible. (optional)
>
> 3) Partition file tail with empty pages can be removed at any time.
>
> 4) In case you have cold data inside the tail, just lock the page and
> perform migration to
> "page from free-list with enough space closest to the beginning of the
> file".
> This operation can be scheduled.
>
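Anton's steps 1, 3 and 4 above can be illustrated with a toy model. This is only a sketch under loud assumptions: the class PartitionFileSketch and all of its methods are hypothetical and are not Ignite APIs or Ignite's actual free-list implementation. Pages are modelled as plain counters, and indexes, checkpoints and concurrent access (the hard parts Vladimir mentions further down the thread) are ignored. Step 2 (merging on removal) is left out because it was marked optional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model of one partition file: fixed-capacity pages addressed by index (hypothetical, not Ignite code). */
final class PartitionFileSketch {
    private final int pageCapacity;                             // entries that fit in one page
    private final List<Integer> used = new ArrayList<>();       // used.get(i) = entries stored in page i
    private final TreeSet<Integer> freeList = new TreeSet<>();  // pages with spare room, ordered by offset

    PartitionFileSketch(int pageCapacity) {
        this.pageCapacity = pageCapacity;
    }

    /** Step 1: write to the free page closest to the file start; grow the file only if no page has room. */
    int put() {
        int page;
        if (freeList.isEmpty()) {
            page = used.size();
            used.add(0);
            freeList.add(page);
        } else {
            page = freeList.first();
        }
        used.set(page, used.get(page) + 1);
        if (used.get(page) == pageCapacity) {
            freeList.remove(page);
        }
        return page;
    }

    /** Removing an entry makes the page a candidate for future writes again. */
    void remove(int page) {
        if (used.get(page) == 0) {
            throw new IllegalStateException("page " + page + " is empty");
        }
        used.set(page, used.get(page) - 1);
        freeList.add(page);
    }

    /** Step 4: move entries one by one from the last non-empty page to free pages nearer the start. */
    void migrateTail() {
        while (true) {
            int tail = lastNonEmptyPage();
            if (tail < 0 || freeList.isEmpty()) {
                return;
            }
            int target = freeList.first();
            if (target >= tail) {
                return; // no page closer to the start has room
            }
            used.set(tail, used.get(tail) - 1);
            freeList.add(tail);
            used.set(target, used.get(target) + 1);
            if (used.get(target) == pageCapacity) {
                freeList.remove(target);
            }
        }
    }

    /** Step 3: drop empty pages at the end of the file; returns how many pages were released. */
    int truncateTail() {
        int removed = 0;
        while (!used.isEmpty() && used.get(used.size() - 1) == 0) {
            int last = used.size() - 1;
            used.remove(last);      // List.remove(int index)
            freeList.remove(last);  // TreeSet.remove(Object), value is boxed
            removed++;
        }
        return removed;
    }

    int pageCount() {
        return used.size();
    }

    private int lastNonEmptyPage() {
        for (int i = used.size() - 1; i >= 0; i--) {
            if (used.get(i) > 0) {
                return i;
            }
        }
        return -1;
    }
}
```

In this model, filling 100 pages of capacity 4 and then removing two entries from every page leaves nothing to truncate, which mirrors the complaint that started the thread. Calling migrateTail() and then truncateTail() shrinks the file to 50 pages.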
> On Wed, Jan 9, 2019 at 4:43 PM Dmitriy Pavlov <dpavlov@apache.org> wrote:
>
> > > In the TC Bot, I used to create a second cache named CacheV2 and
> > > migrate the needed data from CacheV1 to CacheV2.
> >
> > After CacheV1 destroy(), files are removed and disk space is freed.
> >
> > > On Wed, Jan 9, 2019 at 12:04 PM Павлухин Иван <vololo100@gmail.com> wrote:
> >
> > > Vyacheslav,
> > >
> > > Have you investigated how other vendors (Oracle, Postgres) tackle this
> > > problem?
> > >
> > > > I have one wild idea. Could the problem be solved by stopping a node
> > > > which needs to be defragmented, clearing its persistence files and
> > > > restarting the node? After rebalance the node will receive all data
> > > > back without fragmentation. I see a big downside -- sending data
> > > > across the network. But perhaps we can play with affinity and start
> > > > a new node on the same host which will receive the same data; after
> > > > that the old node can be stopped. It looks more like a kind of
> > > > workaround, but perhaps it can be turned into a workable solution.
> > >
> > > > On Wed, Jan 9, 2019 at 10:49 AM Vyacheslav Daradur <daradurvs@gmail.com> wrote:
> > > >
> > > > Yes, it's about Page Memory defragmentation.
> > > >
> > > > > Pages in partition files are stored sequentially, so it possibly
> > > > > makes sense to defragment pages first to avoid inter-page gaps,
> > > > > since we use page offsets to manage them.
> > > > >
> > > > > I filed an issue [1]; I hope we will be able to find resources to
> > > > > solve it before the 2.8 release.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/IGNITE-10862
> > > >
> > > > > On Sat, Dec 29, 2018 at 10:47 AM Павлухин Иван <vololo100@gmail.com> wrote:
> > > > >
> > > > > I suppose it is about Ignite Page Memory pages defragmentation.
> > > > >
> > > > > > We can get 100 allocated pages, each of which becomes only e.g. 50%
> > > > > > filled after the removal of some entries. But they will still
> > > > > > occupy the space of 100 pages on a hard drive.
> > > > >
> > > > > > On Fri, Dec 28, 2018 at 8:45 PM Denis Magda <dmagda@apache.org> wrote:
> > > > > >
> > > > > > > Shouldn't the OS take care of defragmentation? What we need to do
> > > > > > > is give a way to remove stale data and "release" the allocated
> > > > > > > space somehow through the tools, MBeans or API methods.
> > > > > >
> > > > > > --
> > > > > > Denis
> > > > > >
> > > > > >
> > > > > > > On Fri, Dec 28, 2018 at 6:24 AM Vladimir Ozerov <vozerov@gridgain.com> wrote:
> > > > > >
> > > > > > > Hi Vyacheslav,
> > > > > > >
> > > > > > > > AFAIK this is not implemented. Shrinking/defragmentation is an
> > > > > > > > important optimization, not only because it releases free space,
> > > > > > > > but also because it decreases the total number of pages. But it
> > > > > > > > is not very easy to implement, as you have to reshuffle both data
> > > > > > > > entries and index entries while maintaining consistency for
> > > > > > > > concurrent reads and updates. Alternatively, we can think of
> > > > > > > > offline defragmentation. It will be easier to implement and
> > > > > > > > faster, but concurrent operations will be prohibited.
> > > > > > >
> > > > > > > > On Fri, Dec 28, 2018 at 4:08 PM Vyacheslav Daradur <daradurvs@gmail.com> wrote:
> > > > > > >
> > > > > > > > > Igniters, we have faced the following problem on one of our
> > > > > > > > > deployments.
> > > > > > > >
> > > > > > > > > Let's imagine that we have been using IgniteCache with PDS
> > > > > > > > > enabled for some time:
> > > > > > > > > - disc space was consumed as the amount of data grew, e.g. to
> > > > > > > > > 100GB;
> > > > > > > > > - then we removed data that was no longer relevant, e.g. 50GB,
> > > > > > > > > which had become useless for us;
> > > > > > > > > - disc space stopped growing with new data, but it was not
> > > > > > > > > released and still took 100GB instead of the expected 50GB;
> > > > > > > >
> > > > > > > > > Another use case:
> > > > > > > > > - a user extracts data from an IgniteCache to store it in a
> > > > > > > > > separate IgniteCache or another store;
> > > > > > > > > - the disc is still occupied, and the user is not able to store
> > > > > > > > > data in a different cache on the same cluster because of the
> > > > > > > > > disc limitation;
> > > > > > > >
> > > > > > > > > How can we help the user free up disc space if the amount of
> > > > > > > > > data in an IgniteCache has been reduced many times over and will
> > > > > > > > > not increase in the near future?
> > > > > > > >
> > > > > > > > > AFAIK, we have a mechanism for reusing memory pages, which
> > > > > > > > > allows us to store new data in pages that were already
> > > > > > > > > allocated and previously held removed data.
> > > > > > > > > Are there any chances to shrink data and free up space on disc
> > > > > > > > > (with defragmentation if possible)?
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best Regards, Vyacheslav D.
> > > > > > > >
> > > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best regards,
> > > > > Ivan Pavlukhin
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards, Vyacheslav D.
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Ivan Pavlukhin
> > >
> >
>
