tinkerpop-dev mailing list archives

From Stephen Mallette <spmalle...@gmail.com>
Subject Re: [DISCUSS] Null Handling 3.5.x
Date Tue, 04 Jun 2019 12:04:28 GMT
For further discussion/planning:

https://issues.apache.org/jira/browse/TINKERPOP-2235

UNSET is interesting. Perhaps it's too specific to Cassandra for us to
include in TinkerPop, though. I'd be curious whether there are other graph
systems, not backed by Cassandra, that might make use of such a
feature.
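
To make the NULL/UNSET distinction from the quoted message concrete, here is
a minimal plain-Python sketch. The `NULL` and `UNSET` sentinels and the
`apply_write` helper are hypothetical names made up for illustration; they
are not TinkerPop or Cassandra APIs, and storing `None` for NULL is an
assumption about what "overwriting" means here.

```python
class _Marker:
    """Hypothetical sentinel type; not part of any real driver API."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


NULL = _Marker("NULL")    # explicit null: a write occurs and overwrites the value
UNSET = _Marker("UNSET")  # placeholder: ignored at write time


def apply_write(stored, updates):
    """Return a new property map after applying `updates` to `stored`."""
    result = dict(stored)
    for key, value in updates.items():
        if value is UNSET:
            continue            # no write happens; existing value survives
        if value is NULL:
            result[key] = None  # assumption: the overwrite is modeled as None
        else:
            result[key] = value
    return result


before = {"name": "marko", "age": 29}
after = apply_write(before, {"name": NULL, "age": UNSET, "city": "santa fe"})
print(after)
```

Under these assumptions `name` is overwritten, `age` keeps its old value, and
`city` is added.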

On Mon, Jun 3, 2019 at 6:21 AM Jorge Bay Gondra <jorgebaygondra@gmail.com>
wrote:

> I think having a null literal makes sense. It plays well with existing SQL
> providers, where there are representations for null:
> https://docs.microsoft.com/en-us/dotnet/api/system.dbnull.value
>
> I would propose another literal: UNSET. On some db providers, there's a
> distinction between NULL, which causes a write to occur (overwriting the
> existing value), and UNSET, which is ignored at write time.
>
> See UNSET ticket in Cassandra:
> https://issues.apache.org/jira/browse/CASSANDRA-7304
>
> On Mon, Jun 3, 2019 at 10:49 AM Dmitry Novikov <dmitry.novikov@neueda.com>
> wrote:
>
> > Hello Stephen,
> >
> > Sounds like a great idea!
> >
> > One more use case: returning a `null` object when a property does not
> > exist:
> >
> > g.V().limit(1).coalesce(values('notSureIfExists'),
> > constant(Null.instance()))
> >
> > This would be very useful when working with steps that may fail on a
> > nonexistent value. For example, the `project` step:
> >
> > gremlin> g.V().limit(1).project('a',
> > 'b').by(values('name')).by(values('notSureIfExists'))
> > The provided traverser does not map to a value:
> > v[1]->[PropertiesStep([notSureIfExists],value)]
> >
> > Could be improved:
> >
> > gremlin> g.V().limit(1).project('a',
> > 'b').by(values('name')).by(coalesce(values('notSureIfExists'),
> > constant(Null.instance())))
> > ==>[a:marko,b:null]
> >
> > `null` is better than any custom constant here, because it clearly
> > represents a missing value.
> >
> > To avoid the need for `null` guards, a rule could be defined: "steps that
> > take `null` as input will produce `null` as output":
> >
> > gremlin> g.inject(Null.instance()).id()
> > ==> null
> > gremlin> g.inject(Null.instance()).math("_ + 1")
> > ==> null
> > gremlin> g.inject(Null.instance()).properties().as('a').key()
> > ==> null
> >
> > I see two approaches to handling `null` in aggregation steps:
> >
> > Steps like `max`, `count`... could either fail on a `null` object,
> > requiring a predicate to filter it out:
> >
> > gremlin> g.inject(1).inject(Null.instance()).inject(3).max()
> > Max step does not work with `null` values
> > gremlin> g.inject(1).inject(Null.instance()).inject(3).is(neq(Null.instance())).max()
> > ==>3
> >
> > Alternatively, exclude `null` values from the calculation:
> >
> > gremlin> g.inject(1).inject(Null.instance()).inject(3).max()
> > ==>3
> > gremlin> g.inject(1).inject(Null.instance()).inject(3).count()
> > ==>2
> >
> > On 2019/05/31 17:01:34, Stephen Mallette <spmallette@gmail.com> wrote:
> > > I just spent some time fixing:
> > >
> > > https://issues.apache.org/jira/browse/TINKERPOP-2099
> > >
> > > which dealt with inconsistencies in null handling for property() step when
> > > there is a null value. That's all nice now, but null handling still isn't
> > > so good overall. It's generally inconsistent in how it behaves in a variety
> > > of uses in Gremlin - here are a couple of examples:
> > >
> > > gremlin> g.inject(null)
> > > java.lang.NullPointerException
> > > Type ':help' or ':h' for help.
> > > Display stack trace? [yN]n
> > > gremlin> g.V().constant(null)
> > > gremlin>
> > >
> > > I've also heard the concern on several occasions that mutation traversals
> > > are often difficult to write when you want to remove a property and update
> > > others at the same time, because it forces you into conditional logic where
> > > you have to somehow work in a side effect of property("name").drop() as
> > > opposed to just inlining property('name',null).
> > >
> > > I think we should be a bit more respectful of the concept of null with
> > > Gremlin and while we probably shouldn't allow a literal null into the
> > > traversal stream, it seems like we could provide for our own Null class
> > > that could be used in its place where users/providers needed it, so that
> > > we could do:
> > >
> > > gremlin> g.inject(Null.instance())
> > > ==> null
> > > gremlin> g.V(1).property("x", 1).property("y",
> > > Null.instance()).property("z", 2)
> > > ==> v[1]
> > >
> > > Perhaps we'd add a new Graph.Feature to allow providers to specify how they
> > > handle such things. Taking this approach creates a position where we aren't
> > > really changing core engine behavior. Instead, we're just adding a marker
> > > that can be used by providers/Gremlin to identify the notion of null and
> > > then updating serialization/GLVs to support it.
> > >
> > > Haven't thought much past that point. Any other implications of taking this
> > > direction?
> > >
> >
>
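
The ideas in the quoted thread (a singleton null marker, "null in, null out"
for ordinary steps, and the two options for aggregation steps) can be modeled
roughly as below. This is a plain-Python sketch, not TinkerPop code: the
`Null` class only mimics the proposed `Null.instance()`, and `null_safe`,
`max_strict`, `max_skipping` and `count_skipping` are hypothetical helper
names invented for illustration.

```python
class Null:
    """Hypothetical singleton marker standing in for null in a stream."""

    _instance = None

    @classmethod
    def instance(cls):
        # Lazily create and reuse a single shared marker object.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def __repr__(self):
        return "null"


def null_safe(step):
    """Wrap a step so that a Null input produces Null output."""
    def wrapped(value):
        return value if isinstance(value, Null) else step(value)
    return wrapped


def max_strict(values):
    """Option 1: aggregation fails if any Null is present."""
    values = list(values)
    if any(isinstance(v, Null) for v in values):
        raise ValueError("max does not work with null values")
    return max(values)


def max_skipping(values):
    """Option 2: aggregation ignores Null values."""
    return max(v for v in values if not isinstance(v, Null))


def count_skipping(values):
    """Option 2 applied to count: Null values are not counted."""
    return sum(1 for v in values if not isinstance(v, Null))


stream = [3, Null.instance(), 1]
plus_one = null_safe(lambda x: x + 1)
print([plus_one(v) for v in stream])
print(max_skipping(stream), count_skipping(stream))
```

With option 1 the caller has to filter the marker out before aggregating,
which corresponds to the `is(neq(Null.instance()))` guard in the quoted
example; option 2 moves that filtering into the aggregation itself.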
