ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vladimir Ozerov <voze...@gridgain.com>
Subject Re: CacheEntry serialization failure
Date Mon, 21 Dec 2015 12:57:26 GMT
Several additional problems appeared:
1) We must use fully-qualified class names because simple class names might
also be the same. E.g: class package1.A extends package2.A
2) Our query engine doesn't like dots and "$" from class names. Because of
this fields with these characters cannot be queried.

To workaorund these problems I did the following:
1) Fully qualified class names are used;
2) "." is replaced with "_" and "$" is replaced with "__". E.g. field
org.my.package.MyClass$MyInnerClass.x is written as "
org_my_package_MyClass__MyInnerClass_x";
3) If user would like to query the field, he can call special helper
method BinaryUtils.qualifiedFieldName(Class
cls, String fieldName) returning correct field name.

As this problem is not likely to occur in real quering scenarios, I think
this solution is more or less fine.

Thoughts?


On Mon, Dec 21, 2015 at 12:28 PM, Sergey Kozlov <skozlov@gridgain.com>
wrote:

> From my standpoint Vova's appoach very close to SQL behavior if two joined
> tables have same column names.
>
> On Mon, Dec 21, 2015 at 12:23 PM, Dmitriy Setrakyan <dsetrakyan@apache.org
> >
> wrote:
>
> > Got it. I understand the reasoning now for not throwing an exception.
> Let’s
> > make sure we document this behavior.
> >
> > If there is no additional field-name-parsing overhead, then the proposed
> > API looks very nice.
> >
> > D.
> >
> > On Mon, Dec 21, 2015 at 1:19 AM, Vladimir Ozerov <vozerov@gridgain.com>
> > wrote:
> >
> > > Dima,
> > >
> > > Here is how proposed design works:
> > >
> > > class A {
> > >     int x = 1;
> > > }
> > > class B {
> > >     int x = 2;
> > > }
> > >
> > > BinaryObject obj = ...;
> > >
> > > Object val = obj.field("A.x"); // Returns "1";
> > > Object val = obj.field("B.x"); // Returns "2";
> > > Object val = obj.field("x"); // Returns null;
> > >
> > > boolean exists = obj.hasField("A.x"); // Returns "true";
> > > boolean exists = obj.hasField("B.x"); // Returns "true";
> > > boolean exists = obj.hasField("x"); // Returns "false";
> > >
> > > Looks clean and consistent for me. Remember that we are talking about
> > very
> > > specific use case. It is very unlikely that user will operate on
> objects
> > > conflicting fields in binary form.
> > > Also, there will be no parsing at all. We use field name passed by user
> > > directly.
> > >
> > > Vladimir.
> > >
> > > On Mon, Dec 21, 2015 at 12:10 PM, Dmitriy Setrakyan <
> > dsetrakyan@apache.org
> > > >
> > > wrote:
> > >
> > > > Vova,
> > > >
> > > > We cannot return null in case of a conflict, as user won’t be able to
> > > > differentiate between a conflict and missing field. We should throw
> an
> > > > exception.
> > > >
> > > > Also, I don’t like parsing field names looking for a dot for every
> > field.
> > > > It will introduce a performance overhead for the cases that do not
> have
> > > > conflicts. Instead, we should add another API for this use case,
> > > something
> > > > like field(typeName, fieldName).
> > > >
> > > > D.
> > > >
> > > > On Mon, Dec 21, 2015 at 1:01 AM, Vladimir Ozerov <
> vozerov@gridgain.com
> > >
> > > > wrote:
> > > >
> > > > > Denis,
> > > > > Yes, as we do not know which field to pick, we return null.
> > > > >
> > > > > On Mon, Dec 21, 2015 at 12:00 PM, Denis Magda <dmagda@gridgain.com
> >
> > > > wrote:
> > > > >
> > > > > > Sounds good for me. I would go for this approach.
> > > > > >
> > > > > > In addition if to consider your example below and the user
> decides
> > to
> > > > > look
> > > > > > up a field by its simple name then he/she will get nothing or
> > > exception
> > > > > > (depending on the API), correct?
> > > > > > As an example for this case the method will return null
> > > > > >
> > > > > > BinaryObject obj = ...;
> > > > > > Object val = obj.field("x"); // null will be returned cause
we
> > don't
> > > > know
> > > > > > what particular 'x' we have to return
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Denis
> > > > > >
> > > > > > On 12/21/2015 11:48 AM, Vladimir Ozerov wrote:
> > > > > >
> > > > > >> Folks,
> > > > > >>
> > > > > >> I thought about the solution a bit more and came to the
> following
> > > > > design.
> > > > > >>
> > > > > >> *Scenario:*
> > > > > >> class A {
> > > > > >>      int x;
> > > > > >> }
> > > > > >> class B : extends A {
> > > > > >>      int y;
> > > > > >> }
> > > > > >>
> > > > > >> *Solution:*
> > > > > >> 1) Field A.x is written as *"A.x"*, field B.x is written
as
> > *"B.x"*.
> > > > > I.e.
> > > > > >> *both
> > > > > >> conflicting fields are prefixed* with simple name of the
owning
> > > class.
> > > > > >> 2) API is unchanged. User manipulates these fields on all
public
> > > > methods
> > > > > >> in
> > > > > >> exactly the same way: "A.x" and "B.x".
> > > > > >>
> > > > > >> *Rationale:*
> > > > > >> 1) We cannot prefix only some of conflicting fields. E.g.
if
> > decide
> > > to
> > > > > >> prefix only A.x, then it is not clear how to handle this
case:
> > > > > >>
> > > > > >> class B extends A implements Binarylizable {
> > > > > >>      void write(BinaryWriter writer) {
> > > > > >>          writer.writeInt("B.x", x); // User intentionally
> written
> > > > field
> > > > > as
> > > > > >> "B.x".
> > > > > >>      }
> > > > > >> }
> > > > > >>
> > > > > >> BinaryObject obj = ...;
> > > > > >> Object val = obj.field("B.x"); // Should we lookup for "B.x"
as
> > user
> > > > > asked
> > > > > >> us, or just for "x"?
> > > > > >>
> > > > > >> Prefixing all conflicting fields with class name resolves
the
> > > problem.
> > > > > >>
> > > > > >> 2) If we add methods to manipulate fields not only by name,
but
> by
> > > > > >> (typeName + fieldName) as well, then we will have to add
*9 new
> > > > methods*
> > > > > >> to
> > > > > >>
> > > > > >> API:
> > > > > >> BinaryType.fieldTypeName(String typeName, String fieldName);
> > > > > >> BinaryType.field(String typeName, String fieldName);
> > > > > >> BinaryObject.field(String typeName, String fieldName);
> > > > > >> BinaryObject.hasField(String  typeName, String fieldName);
> > > > > >> BinaryObjectBuilder.getField(String typeName, String fieldName);
> > > > > >> BinaryObjectBuilder.setField(String typeName, String fieldName,
> > > ...);
> > > > > // 3
> > > > > >> overloads
> > > > > >> BinaryObjectBuilder.removeField(String typeName, String
> > fieldName);
> > > > > >>
> > > > > >> This is definitely an overkill for such a rare scenario.
> > > > > >>
> > > > > >> Thoughts?
> > > > > >>
> > > > > >> On Mon, Dec 21, 2015 at 11:22 AM, Vladimir Ozerov <
> > > > vozerov@gridgain.com
> > > > > >
> > > > > >> wrote:
> > > > > >>
> > > > > >> Agree, prefixing parent class fields sound like a better
option.
> > > > > >>> Regarding aliases - I need some time to understand internal
> > > > mechanics.
> > > > > >>> Will answer this a bit later.
> > > > > >>>
> > > > > >>> On Mon, Dec 21, 2015 at 11:18 AM, Dmitriy Setrakyan
<
> > > > > >>> dsetrakyan@apache.org
> > > > > >>>
> > > > > >>>> wrote:
> > > > > >>>> Vova,
> > > > > >>>>
> > > > > >>>> Shouldn’t it be the other way around? Class B
writes the field
> > as
> > > > “a”,
> > > > > >>>> and
> > > > > >>>> class A writes it with a prefix (possibly the hash
code of the
> > > class
> > > > > >>>> name).
> > > > > >>>>
> > > > > >>>> Also, we should clearly document how the SQL queries
are
> > affected
> > > by
> > > > > >>>> this.
> > > > > >>>> AFAIK, we should be using field aliases here, no?
> > > > > >>>>
> > > > > >>>> D.
> > > > > >>>>
> > > > > >>>> On Sun, Dec 20, 2015 at 10:08 AM, Vladimir Ozerov
<
> > > > > vozerov@gridgain.com
> > > > > >>>> >
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>> May be we can use normal field names by default
and add some
> > > > > >>>>>
> > > > > >>>> prefix/suffix
> > > > > >>>>
> > > > > >>>>> if conflict is found? E.g.:
> > > > > >>>>>
> > > > > >>>>> class A {
> > > > > >>>>>      int a; // Write as "a";
> > > > > >>>>> }
> > > > > >>>>>
> > > > > >>>>> class B extends A {
> > > > > >>>>>      int a; // Write as "B_a";
> > > > > >>>>> }
> > > > > >>>>>
> > > > > >>>>> On Fri, Dec 18, 2015 at 10:34 PM, Valentin Kulichenko
<
> > > > > >>>>> valentin.kulichenko@gmail.com> wrote:
> > > > > >>>>>
> > > > > >>>>> Folks,
> > > > > >>>>>>
> > > > > >>>>>> It seems to me that issue here is not with
3rd-party
> > libraries.
> > > We
> > > > > >>>>>>
> > > > > >>>>> just
> > > > > >>>>
> > > > > >>>>> don't properly support class hierarchy in binary
format. Any
> > > class
> > > > > >>>>>>
> > > > > >>>>> that
> > > > > >>>>
> > > > > >>>>> extends another class and has the field with
the same name as
> > > > parent
> > > > > >>>>>>
> > > > > >>>>> has
> > > > > >>>>
> > > > > >>>>> will fail unless user provides custom serialization
logic
> that
> > > will
> > > > > >>>>>>
> > > > > >>>>> handle
> > > > > >>>>>
> > > > > >>>>>> it.
> > > > > >>>>>>
> > > > > >>>>>> What if we prepend the field name with the
simple class name
> > in
> > > > this
> > > > > >>>>>>
> > > > > >>>>> case?
> > > > > >>>>>
> > > > > >>>>>> Say, we have two classes:
> > > > > >>>>>>
> > > > > >>>>>> class A {
> > > > > >>>>>>    private int id;
> > > > > >>>>>> }
> > > > > >>>>>>
> > > > > >>>>>> class B extends A {
> > > > > >>>>>>    private int id;
> > > > > >>>>>> }
> > > > > >>>>>>
> > > > > >>>>>> In this case we will get two fields: "A.id"
and "B.id". The
> > only
> > > > > >>>>>>
> > > > > >>>>> issue is
> > > > > >>>>
> > > > > >>>>> that if there are no name conflict, we should
be able to
> > resolve
> > > by
> > > > > >>>>>>
> > > > > >>>>> both
> > > > > >>>>
> > > > > >>>>> names - with or without prepended type name.
I.e., if A is
> > > > > serialized,
> > > > > >>>>>>
> > > > > >>>>> you
> > > > > >>>>>
> > > > > >>>>>> can get the field value by "id" or "A.id".
This is similar
> to
> > > how
> > > > it
> > > > > >>>>>>
> > > > > >>>>> works
> > > > > >>>>>
> > > > > >>>>>> if you join two SQL tables with the same
column names.
> > > > > >>>>>>
> > > > > >>>>>> Any thoughts on whether it's doable or not?
> > > > > >>>>>>
> > > > > >>>>>> -Val
> > > > > >>>>>>
> > > > > >>>>>> On Fri, Dec 18, 2015 at 10:05 AM, Andrey
Kornev <
> > > > > >>>>>>
> > > > > >>>>> andrewkornev@hotmail.com>
> > > > > >>>>>
> > > > > >>>>>> wrote:
> > > > > >>>>>>
> > > > > >>>>>> In this particular case, the class that
fails is a
> non-static
> > > > inner
> > > > > >>>>>>>
> > > > > >>>>>> class
> > > > > >>>>>
> > > > > >>>>>> that extends another non-static inner class,
so they both
> end
> > up
> > > > > >>>>>>>
> > > > > >>>>>> having
> > > > > >>>>
> > > > > >>>>> the
> > > > > >>>>>>
> > > > > >>>>>>> compiler-generated "this$0" field.
> > > > > >>>>>>>
> > > > > >>>>>>> Regards
> > > > > >>>>>>> Andrey
> > > > > >>>>>>>
> > > > > >>>>>>> Date: Fri, 18 Dec 2015 20:44:12 +0300
> > > > > >>>>>>>> Subject: Re: CacheEntry serialization
failure
> > > > > >>>>>>>> From: vozerov@gridgain.com
> > > > > >>>>>>>> To: dev@ignite.apache.org
> > > > > >>>>>>>>
> > > > > >>>>>>>> The most straightforward solution
which comes to my mind -
> > *do
> > > > not
> > > > > >>>>>>>>
> > > > > >>>>>>> ever
> > > > > >>>>>
> > > > > >>>>>> use
> > > > > >>>>>>>
> > > > > >>>>>>>> BinaryMarshaller by default*. Always
fallback to
> > > > > >>>>>>>>
> > > > > >>>>>>> OptimizedMarshaller
> > > > > >>>>
> > > > > >>>>> unless
> > > > > >>>>>>>
> > > > > >>>>>>>> user explicitly asked us to use
binary format (e.g.
> through
> > > > > >>>>>>>>
> > > > > >>>>>>> package
> > > > > >>>>
> > > > > >>>>> wildcards).
> > > > > >>>>>>>>
> > > > > >>>>>>>> BTW, we already do this for Externalizable
and
> > > > > >>>>>>>>
> > > > > >>>>>>> readObject/writeObject.
> > > > > >>>>>
> > > > > >>>>>> On Fri, Dec 18, 2015 at 8:41 PM, Vladimir
Ozerov <
> > > > > >>>>>>>>
> > > > > >>>>>>> vozerov@gridgain.com
> > > > > >>>>>
> > > > > >>>>>> wrote:
> > > > > >>>>>>>>
> > > > > >>>>>>>> Ah, I saw your problem with DirectedSpecifics.
We need to
> > > think
> > > > > >>>>>>>>>
> > > > > >>>>>>>> about
> > > > > >>>>>
> > > > > >>>>>> how
> > > > > >>>>>>>
> > > > > >>>>>>>> to solve it. Here is the case:
> > > > > >>>>>>>>> 1) Class is Serilzable and cannot
be changed;
> > > > > >>>>>>>>> 2) There are several duplicated
field names;
> > > > > >>>>>>>>> => BinaryMarshaller cannot
handle it.
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> Any thoughts?
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> On Fri, Dec 18, 2015 at 8:34
PM, Vladimir Ozerov <
> > > > > >>>>>>>>>
> > > > > >>>>>>>> vozerov@gridgain.com
> > > > > >>>>>>
> > > > > >>>>>>> wrote:
> > > > > >>>>>>>>>
> > > > > >>>>>>>>> I fixed the problem, it was
a bug actually.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> By default classes which
has some custom Java logic
> (e.g.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> Externalizable,
> > > > > >>>>>>>
> > > > > >>>>>>>> or with writeObject/readObject methods)
will be written
> > using
> > > > > >>>>>>>>>> OptimizedMarshaller, so
similar field names is not a
> > > problem.
> > > > > >>>>>>>>>> If you want to serialize
such class in binary format and
> > > have
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> duplicate
> > > > > >>>>>>>
> > > > > >>>>>>>> field names, you should provide
your own BinarySerializer,
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> which
> > > > > >>>>
> > > > > >>>>> will
> > > > > >>>>>>
> > > > > >>>>>>> write
> > > > > >>>>>>>
> > > > > >>>>>>>> these fields with different names.
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> On Fri, Dec 18, 2015 at
8:07 PM, Andrey Kornev <
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>> andrewkornev@hotmail.com>
> > > > > >>>>>>>
> > > > > >>>>>>>> wrote:
> > > > > >>>>>>>>>>
> > > > > >>>>>>>>>> How am I supposed to handle
this situation if the class
> > > comes
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>> from
> > > > > >>>>>
> > > > > >>>>>> a
> > > > > >>>>>>
> > > > > >>>>>>> 3d
> > > > > >>>>>>>
> > > > > >>>>>>>> party I can't modify?
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Thanks
> > > > > >>>>>>>>>>> Andrey
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>> Date: Fri, 18 Dec 2015
09:12:22 +0300
> > > > > >>>>>>>>>>>> Subject: Re: CacheEntry
serialization failure
> > > > > >>>>>>>>>>>> From: vozerov@gridgain.com
> > > > > >>>>>>>>>>>> To: dev@ignite.apache.org
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> I'll take a look.
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> On Fri, Dec 18,
2015 at 4:37 AM, Valentin Kulichenko <
> > > > > >>>>>>>>>>>> valentin.kulichenko@gmail.com>
wrote:
> > > > > >>>>>>>>>>>>
> > > > > >>>>>>>>>>>> Folks,
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> It looks like
CacheEntry implementation (i.e., the
> > entry
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>> that
> > > > > >>>>
> > > > > >>>>> contains
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> version) can't be
properly serialized with the
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>> BinaryMarshaller.
> > > > > >>>>>>
> > > > > >>>>>>> I
> > > > > >>>>>>>
> > > > > >>>>>>>> created
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> the test and the
ticket:
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-2203
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>>> Can someone take
a look?
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>> -Val
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>>>
> > > > > >>>>>>>>>>>
> > > > > >>>>>>>>>>
> > > > > >>>>>>>
> > > > > >>>
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Sergey Kozlov
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message