ignite-dev mailing list archives

From Ilya Kasnacheev <ilya.kasnach...@gmail.com>
Subject Re: Compression prototype
Date Thu, 23 Aug 2018 12:17:46 GMT
Hello!

My plan was to add a compression section to the cache configuration, where
you can enable compression, enable key compression (which has heavier
performance implications), adjust dictionary gathering settings, and in the
future possibly choose between algorithms. I'm not sure about that last
part, since my assumption is that you can always just use the latest and
greatest, but maybe we can offer e.g. a very fast but weaker algorithm vs.
a slower but stronger one.
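
To make the shape of such a section concrete, here is a minimal sketch in
Java. Every name in it (CacheCompressionConfiguration, CompressionAlgorithm,
the setters, the default sample count) is hypothetical and made up for
illustration; none of it is existing Ignite API or taken from the PR.

```java
// Hypothetical sketch only: none of these types or setters exist in Apache Ignite.
enum CompressionAlgorithm {
    FAST,   // very fast, weaker compression
    STRONG  // slower, stronger compression
}

final class CacheCompressionConfiguration {
    private boolean enabled;
    private boolean keyCompressionEnabled;
    // Assumed knob for "dictionary gathering settings": number of entries to sample.
    private int dictionarySampleCount = 1024;
    private CompressionAlgorithm algorithm = CompressionAlgorithm.FAST;

    CacheCompressionConfiguration setEnabled(boolean enabled) {
        this.enabled = enabled;
        return this;
    }

    CacheCompressionConfiguration setKeyCompressionEnabled(boolean keyCompressionEnabled) {
        this.keyCompressionEnabled = keyCompressionEnabled;
        return this;
    }

    CacheCompressionConfiguration setDictionarySampleCount(int dictionarySampleCount) {
        if (dictionarySampleCount <= 0)
            throw new IllegalArgumentException("dictionarySampleCount must be positive");
        this.dictionarySampleCount = dictionarySampleCount;
        return this;
    }

    CacheCompressionConfiguration setAlgorithm(CompressionAlgorithm algorithm) {
        if (algorithm == null)
            throw new IllegalArgumentException("algorithm must not be null");
        this.algorithm = algorithm;
        return this;
    }

    boolean isEnabled() { return enabled; }

    // Keys are only compressed when compression as a whole is switched on.
    boolean isKeyCompressionEnabled() { return enabled && keyCompressionEnabled; }

    int getDictionarySampleCount() { return dictionarySampleCount; }

    CompressionAlgorithm getAlgorithm() { return algorithm; }
}
```

Keeping the algorithm as an enum with a default means a later release could
add values without changing the shape of the section.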

I'm not sure yet whether we should share one dictionary between all caches
or keep a separate dictionary for every cache.
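
To illustrate what the dictionary question is about, here is a small sketch
of compressing against a pre-built dictionary. It does not use the
prototype's custom LZW+Huffman; it swaps in the JDK's DEFLATE with a preset
dictionary (java.util.zip.Deflater.setDictionary) as a stand-in. The
DictionaryCodec class is a hypothetical name for this example.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: DEFLATE with a preset dictionary stands in for the
// prototype's own algorithm, which is not reproduced here.
final class DictionaryCodec {
    private final byte[] dictionary;

    DictionaryCodec(byte[] dictionary) {
        this.dictionary = dictionary.clone();
    }

    byte[] compress(byte[] value) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            // The dictionary must be set before any input is deflated.
            deflater.setDictionary(dictionary);
            deflater.setInput(value);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            while (!deflater.finished())
                out.write(buf, 0, deflater.deflate(buf));
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                out.write(buf, 0, n);
                if (n == 0 && !inflater.finished()) {
                    // The stream asks for the same dictionary it was written with.
                    if (inflater.needsDictionary())
                        inflater.setDictionary(dictionary);
                    else
                        throw new IllegalArgumentException("Truncated or corrupt input");
                }
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupt input or wrong dictionary", e);
        } finally {
            inflater.end();
        }
    }
}
```

In these terms, a shared dictionary is a single DictionaryCodec used by all
caches, while per-cache dictionaries would be one codec per cache name. A
value can only be read back with the dictionary it was written with, so the
choice also decides how many dictionaries have to be stored and kept around.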

As for the data format, of course there will be room for further
extension.

Regards,

-- 
Ilya Kasnacheev

2018-08-23 15:13 GMT+03:00 Sergey Kozlov <skozlov@gridgain.com>:

> Hi Ilya
>
> Is there a plan to introduce it as an option of Ignite configuration? In
> that case, instead of a boolean I suggest using an enum, to reserve the
> ability to add compression algorithms in the future.
>
> On Thu, Aug 23, 2018 at 1:09 PM, Ilya Kasnacheev <
> ilya.kasnacheev@gmail.com>
> wrote:
>
> > Hello!
> >
> > I want to share with the developer community my compression prototype.
> >
> > Long story short, it compresses each BinaryObject's byte[] as it is
> > written to a Durable Memory page, operating on a pre-built dictionary.
> > Typical compression ratio is 0.4 (meaning 2.5x compression) using custom
> > LZW+Huffman. Metadata, indexes and primitive values are unaffected
> > entirely.
> >
> > This is akin to DB2's table-level compression[1] but independently
> > invented.
> >
> > On Yardstick tests the performance hit is 6% with PDS and up to 25% (in
> > throughput) with In-Memory loads. It also means you can fit roughly
> > twice as much data into the same IM cluster, or have a higher RAM/disk
> > ratio with a PDS cluster, saving on hardware or decreasing latency.
> >
> > The code is available as PR 4295[2] (set IGNITE_ENABLE_COMPRESSION=true
> > to activate). Note that it will not presently survive a PDS node
> > restart. The impact is very small, so the patch should be applicable to
> > most 2.x releases.
> >
> > Sure, there's a long way to go before this prototype can hope to be
> > included, but first I would like to hear input from fellow Igniters.
> >
> > See also IEP-20[3].
> >
> > 1. https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0052331.html
> > 2. https://github.com/apache/ignite/pull/4295
> > 3. https://cwiki.apache.org/confluence/display/IGNITE/IEP-20%3A+Data+Compression+in+Ignite
> >
> > Regards,
> >
> > --
> > Ilya Kasnacheev
> >
>
>
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>
