ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pavel Tupitsyn <ptupit...@apache.org>
Subject Re: Ignite 3.0 IgniteTables API Improvement Suggestion
Date Thu, 01 Jul 2021 09:25:30 GMT
Val, agreed.
Let's add length(), value(index), and Iterable to the Tuple interface.

I've updated the ticket: https://issues.apache.org/jira/browse/IGNITE-14342

On Wed, Jun 30, 2021 at 11:17 PM Valentin Kulichenko <
valentin.kulichenko@gmail.com> wrote:

> Pavel,
>
> Thanks for your response, makes sense.
>
> Regarding the values() method: I would instead add the required methods to
> the Tuple itself. E.g., it can implement Iterable, and additionally have
> the value(int index) method, plus anything else that we might need.
> I don't like returning a collection, because in Java it's a much wider API.
> We will end up returning an implementation that throws
> UnsupportedOperationException for most of the methods. I know this approach
> is not uncommon in the Java world, but this doesn't make it good.
> In .NET and other languages, we can use other abstractions, of course.
> Every platform has its own specifics and practices, so APIs don't have to
> be identical.
>
> -Val
>
> On Wed, Jun 30, 2021 at 7:44 AM Pavel Tupitsyn <ptupitsyn@apache.org>
> wrote:
>
> > Hi Andrey,
> >
> > > This will force us to bother about interfaces/contracts and
> compatibility
> > > aspects in the future
> >
> > Schemas and versions are part of thin client wire protocol.
> > This protocol is a public API - we'll have to care about compatibility
> > anyway.
> >
> > Schema evolution is an important feature that users should understand.
> > I see no reason to hide it from the API.
> >
> >
> > > We already have a ticket [1]...
> > > Will 'idx -> column' mapping only be enough for your purposes?
> >
> > The ticket sounds reasonable, but there are no API details.
> > At the very least we need public access to column names, types, and
> > indexes.
> > I propose something like this:
> >
> > interface Tuple {
> >     TupleSchema getSchema();
> > }
> >
> > class TupleSchema {
> >     int getVersion();
> >     List<TupleSchemaColumn> getColumns();
> > }
> >
> > class TupleSchemaColumn {
> >     int index;
> >     String name;
> >     String typeName;
> >     bool isKey;
> > }
> >
> > On Wed, Jun 30, 2021 at 11:05 AM Andrey Mashenkov <
> > andrey.mashenkov@gmail.com> wrote:
> >
> > > Hi Pavel,
> > >
> > > 2. Schema and its version are internal things and shouldn't be exposed,
> > > otherwise, eventually, it will lead to the user will manage schemas
> > > manually on his side for some purposes.
> > > This will force us to bother about interfaces/contracts and
> compatibility
> > > aspects in the future with uncertain benefits for a user.
> > >
> > > As SchemaDescriptor is an internal API and the user expects a public
> API.
> > > I don't think we want to convert the descriptor back into public API
> > terms
> > > and it may be impossible for some data.
> > >
> > > We already have a ticket [1] to support accessing a column by an index
> > and
> > > exposing some metadata.
> > > Will 'idx -> column' mapping only be enough for your purposes?
> > >
> > > > int Tuple.columnIndex(String)
> > >
> > >
> > >
> > >  [1] https://issues.apache.org/jira/browse/IGNITE-14342
> > >
> > > On Wed, Jun 30, 2021 at 9:34 AM Pavel Tupitsyn <ptupitsyn@apache.org>
> > > wrote:
> > >
> > > > Hi Val,
> > > >
> > > > Thanks for the comments, please see below:
> > > >
> > > >
> > > > > Users will identify tables using names, they will never use IDs
> > > >
> > > > Ok, let's keep it this way.
> > > >
> > > >
> > > > > Sounds like the Tuple should implement Iterable.
> > > >
> > > > I don't think Iterable is enough.
> > > > We should have a way to get values by column index: Tuple.value(int
> > > index),
> > > > where index is according to column order in schema.
> > > >
> > > > Combined with Tuple.schema(), it will provide an efficient way to
> > consume
> > > > data -
> > > > users will be able to read columns in any order, skip some of them,
> > etc.
> > > >
> > > > This is a common pattern in APIs like ODBC or System.Data [1],
> > > > which we'll need to implement on top of our thin clients later.
> > > >
> > > >
> > > > > However, whether a user might
> > > > > need to get a table schema for a particular version, I'm not sure.
> Do
> > > you
> > > > > have a use case in mind for this? If not, I would keep this
> internal
> > > >
> > > > Ok, we can keep the Table.schema(ver) method internal, as long as
> > > > Table.schemas() is public and includes schema versions.
> > > >
> > > >
> > > > > We already have async counterparts for all the methods where this
> is
> > > > applicable
> > > >
> > > > IgniteTables.createTable(), dropTable(), tables(), table() do not
> have
> > > > async counterparts,
> > > > while internally they perform blocking wait on some futures.
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://docs.microsoft.com/en-us/dotnet/api/system.data.idatareader?view=net-5.0
> > > >
> > > > On Wed, Jun 30, 2021 at 4:51 AM Valentin Kulichenko <
> > > > valentin.kulichenko@gmail.com> wrote:
> > > >
> > > > > Hi Pavel,
> > > > >
> > > > > Please see my comments below.
> > > > >
> > > > > -Val
> > > > >
> > > > > On Tue, Jun 29, 2021 at 2:23 PM Pavel Tupitsyn <
> ptupitsyn@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > While working on "IEP-76 Thin Client Protocol for Ignite 3.0"
[1]
> > (to
> > > > be
> > > > > > discussed separately), the following suggestions for the Table
> API
> > > came
> > > > > up:
> > > > > >
> > > > > > 1. Expose table IDs: sending table name with every operation
> (e.g.
> > > GET)
> > > > > is
> > > > > > inefficient, string serialization is expensive by itself and
> names
> > > can
> > > > be
> > > > > > long.
> > > > > >     - Table.id()
> > > > > >     - IgniteTables.table(UUID)
> > > > > >     - IgniteTables.dropTable(UUID)
> > > > > >
> > > > >
> > > > > I don't think this should be a part of the public API. Users will
> > > > identify
> > > > > tables using names, they will never use IDs. As an internal
> > > optimization
> > > > > though - sure, we can have that. Sounds similar to the cache ID in
> > 2.x.
> > > > >
> > > > >
> > > > > >
> > > > > > 2. Expose tuple schemas: to reduce payload size when sending
> tuples
> > > to
> > > > > the
> > > > > > client, we'll write only the schema version and column values,
> then
> > > the
> > > > > > client can retrieve and cache schemas (ordered set of columns
per
> > > > > version).
> > > > > >     - Tuple.schema()
> > > > > >     - Table.schemas()
> > > > > >     - Table.schema(ver)
> > > > > >
> > > > >
> > > > > Exposing the schema of a tuple makes sense. However, whether a user
> > > might
> > > > > need to get a table schema for a particular version, I'm not sure.
> Do
> > > you
> > > > > have a use case in mind for this? If not, I would keep this
> internal
> > as
> > > > > well.
> > > > >
> > > > >
> > > > > >
> > > > > > 3. Expose tuple values as a collection: to serialize tuples
> > > efficiently
> > > > > > (see p2) we need an API to get all values at once. Right now
the
> > only
> > > > API
> > > > > > is to get values by column name, which involves a HashMap lookup
> on
> > > > every
> > > > > > call.
> > > > > >     - Tuple.values()
> > > > > >
> > > > >
> > > > > Sounds like the Tuple should implement Iterable.
> > > > >
> > > > >
> > > > > >
> > > > > > 4. Simplify createTable API: use POJO-based configuration.
> > > > > >     Creating a Consumer when some properties are optional seems
> to
> > be
> > > > > > non-trivial.
> > > > > >
> > > > >
> > > > > Yes, it's currently convoluted for sure. To my knowledge, there are
> > > plans
> > > > > to improve this, but I will let other folks chime in with more
> > details.
> > > > >
> > > > >
> > > > > >
> > > > > > 5. Add async methods to IgniteTables API (all methods are async
> > > inside
> > > > > > anyway)
> > > > > >
> > > > >
> > > > > We already have async counterparts for all the methods where this
> is
> > > > > applicable. E.g., for TableView#get, there is TableView#getAsync.
> Is
> > > > there
> > > > > anything else that you propose to add?
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > Thoughts, objections?
> > > > > >
> > > > > >
> > > > > > [1]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-76+Thin+Client+Protocol+for+Ignite+3.0
> > > > > >
> > > > > > [2]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-54%3A+Schema-first+Approach
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrey V. Mashenkov
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message