ignite-dev mailing list archives

From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: Ignite as distributed file storage
Date Mon, 02 Jul 2018 13:50:27 GMT
Pavel,

Thank you. I'll wait for the feature comparison and concrete use cases, because
to me this feature still sounds too abstract to judge whether the product
would benefit from it.

On Mon, Jul 2, 2018 at 3:15 PM Pavel Kovalenko <jokserfn@gmail.com> wrote:

> Dmitriy,
>
> I think we have a little miscommunication here. Of course, I meant
> supporting large entries / chunks of binary data. Internally it will be
> BLOB storage, which can be accessed through various interfaces.
> "File" is just an abstraction for the end user's convenience, a wrapper
> layer that provides a user-friendly API for storing BLOBs directly. We
> shouldn't provide full file protocol support with file system capabilities.
> It can be added later, but right now it's unnecessary and introduces extra
> complexity.
>
> We can implement our BLOB storage step by step. The first step is the
> core functionality and support for saving large pieces of binary data to it.
> The "File" layer, Web layer, etc. can be added later.
>
> The initial IGFS design is not well suited to adding a persistence layer.
> I don't think we should make any changes to it; in my view that project
> is almost outdated. We will drop IGFS after implementing a File System
> layer over our BLOB storage.
>
> Vladimir,
>
> I will prepare a comparison with other existing distributed file storage
> systems and file systems in a few days.
>
> Regarding use of the data grid, I never said that we need transactions,
> sync backups, etc. We need just a few core things: Atomic cache with
> persistence, Discovery, Baseline, Affinity, and Communication.
> We can implement the rest ourselves, so this feature can develop
> independently of other non-core features.
> To me, the Ignite way is to give our users a fast and convenient way to
> solve their problems with good performance and durability. We have a
> problem with storing large data, and we should solve it.
> For the other points, see my message to Dmitriy above.
>
> On Sun, Jul 1, 2018 at 9:48, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
>
> > Pavel,
> >
> > I have actually misunderstood the use case. To be honest, I thought that
> > you were talking about the support of large values in Ignite caches, e.g.
> > objects that are several megabytes in cache.
> >
> > If we are tackling the distributed file system, then in my view, we should
> > be talking about IGFS and adding persistence support to IGFS (which is
> > based on HDFS API). It is not clear to me that you are talking about IGFS.
> > Can you confirm?
> >
> > D.
> >
> >
> > On Sat, Jun 30, 2018 at 10:59 AM, Pavel Kovalenko <jokserfn@gmail.com>
> > wrote:
> >
> > > Dmitriy,
> > >
> > > Yes, I have an approximate design in mind. The main idea is that we
> > > already have a distributed cache for file metadata (our Atomic cache),
> > > and the data flow and distribution will be controlled by our
> > > AffinityFunction and Baseline. We already have discovery and
> > > communication to keep such local file storages in sync. The file data
> > > will be split into large blocks (64-128 MB), which looks very similar to
> > > our WAL. Each block can contain one or more file chunks. The tablespace
> > > (segment ids, offsets, etc.) will be stored in our regular page memory.
> > > These are the key ideas for implementing the first version of such
> > > storage. We already have similar components in our persistence, so this
> > > experience can be reused to develop such storage.
> > >
> > > Denis,
> > >
> > > Nothing significant should be changed at our memory level. It will be a
> > > separate, pluggable component over the cache. Most of the functions that
> > > give a performance boost can be delegated to the OS level (memory-mapped
> > > files, DMA, direct writes from socket to disk and vice versa). Ignite and
> > > the File Storage can develop independently of each other.
> > >
> > > Alexey Stelmak, who has great experience developing such systems,
> > > can provide more low-level information about how it should look.
> > >
> > > On Sat, Jun 30, 2018 at 19:40, Dmitriy Setrakyan <dsetrakyan@apache.org>
> > > wrote:
> > >
> > > > Pavel, it definitely makes sense. Do you have a design in mind?
> > > >
> > > > D.
> > > >
> > > > On Sat, Jun 30, 2018, 07:24 Pavel Kovalenko <jokserfn@gmail.com> wrote:
> > > >
> > > > > Igniters,
> > > > >
> > > > > I would like to start a discussion about designing a new feature,
> > > > > because I think it's time to start taking steps towards it.
> > > > > I have noticed that some of our users have tried to store large
> > > > > homogeneous entries (> 1, 10, 100 MB/GB/TB) in our caches, but
> > > > > without much success.
> > > > >
> > > > > The IGFS project makes this possible, but in my view it has one big
> > > > > disadvantage: it's in-memory only, so users face a strict limit on
> > > > > the size of their data and a risk of data loss.
> > > > >
> > > > > Our durable memory can persist data that doesn't fit in RAM to
> > > > > disk, but its page structure is not designed to store large pieces
> > > > > of data.
> > > > >
> > > > > There are a lot of distributed file system projects like HDFS,
> > > > > GlusterFS, etc. But all of them concentrate on implementing a
> > > > > high-grade file protocol rather than a user-friendly API, which
> > > > > leads to a high entry threshold for building anything on top of them.
> > > > > We shouldn't go this way. Our main goal should be to give the user
> > > > > an easy and fast way to use file storage and processing here and now.
> > > > >
> > > > > If we take HDFS as the closest project by functionality, we have
> > > > > one big advantage over it. We can use our caches as file metadata
> > > > > storage and scale it practically without limit, while HDFS is
> > > > > bounded by NameNode capacity and has big problems with keeping a
> > > > > large number of files in the system.
> > > > >
> > > > > We gained very good experience with persistence when we developed
> > > > > our durable memory, and we can combine it with our experience with
> > > > > services, binary protocol, and I/O to start designing a new IEP.
> > > > >
> > > > > Use cases and features of the project:
> > > > > 1) Storing XML, JSON, BLOB, CLOB, images, videos, text, etc. without
> > > > > overhead or the possibility of data loss.
> > > > > 2) Easy, pluggable, fast and distributed file processing,
> > > > > transformation and analysis (e.g. an ImageMagick processor for image
> > > > > transformation, LuceneIndex for text, whatever; it's bounded only by
> > > > > your imagination).
> > > > > 3) Scalability out of the box.
> > > > > 4) User-friendly API and minimal steps to start using this storage
> > > > > in production.
> > > > >
> > > > > I repeat: this project is not supposed to be a high-grade
> > > > > distributed file system with full file protocol support.
> > > > > It should primarily focus on target users who would like to use it
> > > > > without complex preparation.
> > > > >
> > > > > For example, a user can deploy Ignite with such storage and a web
> > > > > server with a REST API as an Ignite service, and get a scalable,
> > > > > performant image server out of the box that can be accessed from
> > > > > any programming language.
> > > > >
> > > > > As a long-term goal, we should focus on storing and processing very
> > > > > large amounts of data such as movies and streaming, which is a big
> > > > > trend today.
> > > > >
> > > > > I would like to say special thanks to our community members Alexey
> > > > > Stelmak and Dmitriy Govorukhin, who significantly helped me put
> > > > > together all the pieces of this puzzle.
> > > > >
> > > > > So, I want to hear your opinions about this proposal.
> > > > >
> > > >
> > >
> >
>
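
The block layout Pavel outlines in the thread (large fixed-size blocks holding one or more file chunks, with segment ids and offsets recorded in a tablespace) can be illustrated with a minimal sketch. This is not Ignite code and not the proposed implementation: `Tablespace`, `ChunkLocation`, `append` and the 64 MB `SEGMENT_SIZE` are hypothetical names and values chosen for illustration, and a plain in-memory dict stands in for page memory.

```python
from dataclasses import dataclass, field

# Assumed value: the low end of the 64-128 MB block size mentioned in the thread.
SEGMENT_SIZE = 64 * 1024 * 1024


@dataclass(frozen=True)
class ChunkLocation:
    """One tablespace entry: which segment holds a chunk, and where."""
    segment_id: int
    offset: int
    length: int


@dataclass
class Tablespace:
    """Hypothetical stand-in for the tablespace the thread places in page memory."""
    segment_size: int = SEGMENT_SIZE
    current_segment: int = 0
    current_offset: int = 0
    files: dict = field(default_factory=dict)  # file name -> list of ChunkLocation

    def append(self, name: str, size: int) -> list:
        """Reserve space for `size` bytes of `name`, splitting across segments."""
        chunks = self.files.setdefault(name, [])
        remaining = size
        while remaining > 0:
            free = self.segment_size - self.current_offset
            if free == 0:
                # Current segment is full: roll over to a fresh one.
                self.current_segment += 1
                self.current_offset = 0
                free = self.segment_size
            length = min(remaining, free)
            chunks.append(
                ChunkLocation(self.current_segment, self.current_offset, length)
            )
            self.current_offset += length
            remaining -= length
        return chunks


# Usage with a tiny segment size so the split is easy to follow.
ts = Tablespace(segment_size=100)
ts.append("a.jpg", 30)   # fits in segment 0
ts.append("b.mp4", 100)  # fills the rest of segment 0, spills into segment 1
```

In this sketch a small file shares a segment with its neighbours, while a file larger than the remaining space is split into chunks across consecutive segments. Only the bookkeeping is modelled; writing the bytes, replication, and affinity-based placement are left out.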
