tinkerpop-dev mailing list archives

From Marko Rodriguez <okramma...@gmail.com>
Subject mm-ADT References Solved
Date Tue, 11 Jun 2019 20:53:38 GMT
Hi,

mm-ADT is moving along nicely. I recently overcame a pretty nasty hurdle — references.

mm-ADT supports a reference type. In mm-ADT-bc (the mm-ADT bytecode language), a reference
is specified as follows.

[db][define,person,[name:@string,age:(@int|@string)]]
    [insert,people,
      [person[name:marko,   age:29]
       person[name:kuppitz, age:littlegirl]
       ...
       person[...]]]
    [values,people,~person*![[db][values,people]]]
      
The last instruction is your basic “get(key)”. However, there is a small addition. All
data access instructions take an optional @reference. The reference I passed in was:

~person*![[db][values,people]]

This thing says:

	“I’m a reference ~ to zero or more * person objects. In order to dereference me, please
use the attached [[db][values,people]] bytecode.”

This is what I (the compiler) know about this “people-table” filled with person objects.
However, because [values] is a storage system access, the storage system can take the reference
and sup it up. For instance, the output of [values] may contain:

~person[age:!lt(85){456}![[db][values,people]]
  -> [has,name,eq,$x.@string]     => ~person[name:x]?
  -> [dedup,name]                 => ![[noop]]
  -> [order,name,asc,(age,desc)?] => ![[noop]]

Cool. The storage system really sup’d up my reference. What new information did I get?

	1. The storage system knows that no one is older than 85.
	2. The storage system knows there are 456 person records in the “people-table.”
	3. The storage system is saying that it has an index on name.
	4. The storage system is saying that names are unique.
	5. The storage system is saying that the person records are sorted by name (w/ ties broken
by age).

Let's go back to our original bytecode.

...
[values,people,~person*![[db][values,people]]]
[dedup,name]
[has,name,eq,marko]

If [dedup] and [has] are the next instructions, guess what: [dedup] matches the [dedup,name]
instruction on the sup’d up ~person, so the virtual machine does nothing (no-op). Then, does the
next [has] instruction require me to dereference the ~person with the [[db][values,people]]
dereference bytecode? Nope. Instead, my ~person maps to a new ~person?. This is an index lookup,
and the question mark says there is 0 or 1 referent for this reference. Again, the storage
system knows that the “people-table” person objects have unique names (schema inference,
let's say).
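
To make the matching step concrete, here is a minimal Python sketch of how a virtual machine might
consult a sup’d up reference before dereferencing it. Everything in it (the Reference class, the
step function, the "noop" and "index" rule names) is a hypothetical illustration, not part of
mm-ADT; it only mimics the [dedup] and [has] cases described above.

```python
from dataclasses import dataclass, field, replace

@dataclass
class Reference:
    type_name: str                  # e.g. "person"
    quantifier: str                 # "*" = zero or more, "?" = zero or one
    deref: tuple                    # bytecode that resolves the referent(s)
    constraints: tuple = ()         # what is already known about the referent(s)
    rewrites: dict = field(default_factory=dict)  # (op, key) -> "noop" | "index"

def step(ref, instruction):
    """Apply one instruction to a reference; dereference only if no rule matches."""
    op, key, *args = instruction
    rule = ref.rewrites.get((op, key))
    if rule is None:
        return "dereference", ref   # storage system gave no shortcut
    if rule == "noop":
        return "noop", ref          # e.g. names are already unique
    # rule == "index": narrow to a zero-or-one reference via an index lookup
    narrowed = replace(ref, quantifier="?",
                       constraints=ref.constraints + ((key, args[-1]),))
    return "index-lookup", narrowed

# Hypothetical reference as returned by the storage system for [values,people].
people = Reference(
    "person", "*", (("db",), ("values", "people")),
    rewrites={("has", "name"): "index", ("dedup", "name"): "noop"},
)

action1, ref1 = step(people, ("dedup", "name"))
action2, ref2 = step(ref1, ("has", "name", "eq", "marko"))
print(action1, action2, ref2.quantifier)  # noop index-lookup ?
```

Neither instruction touches the [[db][values,people]] dereference bytecode; only an instruction
with no matching rule would fall back to it.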

————

Anywho, the big problem I solved was “how do you mechanically dereference a reference?”

The answer: bytecode.

Check this simple example out.

[db][define,project,[title:@string]]
    [define,person,[name:@string,project:@project]]
    [define,tp,project[title:tinkerpop]]
    [insert,projects,tp]
    [insert,people,
      [person[name:marko,   project:tp]
       person[name:kuppitz, project:tp]]]

Okay, so our [db] is looking something like this:



Looks good so far. Now what happens when we do the following:

[db][values,people]
    [has,name,eq,marko]
    [value,project]
    [insert,lang,java]



Doh! Pass by value. So how do we solve this generally? Well, with references (pointers), of
course. The problem I was facing was how a vendor would manage mm-ADT references. That is asking
a lot of them. Then it came to me: the reference should manage its own path to dereferencing.
Bytecode!
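
For readers without the graphics, here is a small Python sketch of the pass-by-value problem. It
assumes (my reading of the missing picture, not something the bytecode spells out) that the store
copies the project value into each person record; the dict layout is purely illustrative.

```python
import copy

tp = {"title": "tinkerpop"}

# Assumption: each insert stores its own copy of the project value.
db = {
    "projects": [copy.deepcopy(tp)],
    "people": [
        {"name": "marko", "project": copy.deepcopy(tp)},
        {"name": "kuppitz", "project": copy.deepcopy(tp)},
    ],
}

# [values,people][has,name,eq,marko][value,project][insert,lang,java]
marko = next(p for p in db["people"] if p["name"] == "marko")
marko["project"]["lang"] = "java"

print(db["projects"][0])              # {'title': 'tinkerpop'}
print(db["people"][1]["project"])     # {'title': 'tinkerpop'}
```

Only marko's private copy gets the lang field; the project in the projects collection and
kuppitz's copy never see it.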

[db][define,project,[title:@string]]
    [define,person,[name:@string,project:@project]]
    [define,tp,     project[title:tinkerpop]]
    [define,tp-ref, ~project[title:tinkerpop]![[db][values,projects][has,title,eq,tinkerpop]]]
    [insert,projects,tp,tp-ref]
    [insert,people,
      [person[name:marko,   project:tp-ref]
       person[name:kuppitz, project:tp-ref]]]



Sweet so far. How about when we mutate?

[db][values,people]
    [has,name,eq,marko]
    [value,project]
    [insert,lang,java]



Tada! So two things.

	1. The tp-ref contains enough information for it to always be able to dereference to the
same logical object.
		~project[title:tinkerpop]![[db][values,projects][has,title,eq,tinkerpop]]
	2. However, if the vendor says: “don’t use that inefficient, index-based data access
path! I support object pointers (ala graph databases),” the vendor can change the dereference
bytecode to leverage their internal pointers.
		- Thus, it works regardless. RDBMS will be index lookups and joins. Graph databases will
be direct pointers.

Finally, you might wonder about the mechanics of what's happening. Here is each instruction in the
mutation block and what object is outgoing from it.

1. [has,name,eq,marko] ==>
         person[name:marko,project:~project[title:tinkerpop]![[db][values,projects][has,title,eq,tinkerpop]]]
2. [value,project] ==>
         ~project[title:tinkerpop]![[db][values,projects][has,title,eq,tinkerpop]]
3. [insert,lang,java] ==>
          // NO WAY TO SOLVE THIS INSTRUCTION WITH THE REFERENCE DATA. WE NEED TO DEREFERENCE.
4. [map,![[db][values,projects][has,title,eq,tinkerpop]]] ==>
         project[title:tinkerpop]
5. [insert,lang,java] ==>
         project[title:tinkerpop,lang:java]
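
The five steps above can be mimicked with a toy interpreter. This is only a sketch under loud
assumptions: Ref, run, and the tuple encoding of instructions are hypothetical names of my own,
the [db] instruction is replaced by passing the database in as an argument, and only tp (not
tp-ref) is placed in the projects collection to keep things small.

```python
from dataclasses import dataclass

@dataclass
class Ref:
    type_name: str   # e.g. "project"
    known: dict      # fields visible without dereferencing
    deref: list      # bytecode: the data access path to the referent

def run(db, bytecode):
    """Evaluate a tiny instruction subset: values, has (equality only), value, insert."""
    objs = []
    for ins in bytecode:
        op = ins[0]
        if op == "values":
            objs = list(db[ins[1]])
        elif op == "has":
            _, key, _pred, val = ins
            objs = [o for o in objs if o.get(key) == val]
        elif op == "value":
            objs = [o[ins[1]] for o in objs]
        elif op == "insert":
            # A mutation cannot be answered from reference data alone, so any
            # reference is first mapped through its own dereference bytecode.
            objs = [x for o in objs
                    for x in (run(db, o.deref) if isinstance(o, Ref) else [o])]
            for o in objs:
                o[ins[1]] = ins[2]
    return objs

tp = {"title": "tinkerpop"}
tp_ref = Ref("project", {"title": "tinkerpop"},
             [("values", "projects"), ("has", "title", "eq", "tinkerpop")])
db = {
    "projects": [tp],
    "people": [{"name": "marko", "project": tp_ref},
               {"name": "kuppitz", "project": tp_ref}],
}

out = run(db, [("values", "people"), ("has", "name", "eq", "marko"),
               ("value", "project"), ("insert", "lang", "java")])
print(out)  # [{'title': 'tinkerpop', 'lang': 'java'}]
```

Because both people hold the same reference, dereferencing kuppitz's project afterwards reaches
the same mutated object. A vendor with native pointers could swap the deref list for something
cheaper without changing run's callers.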

Pretty neat, eh? And in terms of persistence, mm-ADT bytecode will have both a text and binary
representation and thus, mm-ADT references will be able to be stored within any database (assuming
they don’t support their own object references).

References are solved. The solution:

 the reference carries around the chunk of bytecode that contains the data access path to
its referent.

Oh, and two quickies….

	1. References can be shipped over the wire without having to be re-synchronized (the bytecode’s
address space is the logical model, not the physical model).
	2. References can reference objects not on the same machine as them (again, bytecode address
space is not machine space).
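
A rough Python sketch of those two quickies follows. JSON stands in for the text representation
of mm-ADT bytecode purely as an assumption of this example, and to_wire, from_wire, and
dereference are hypothetical helpers, not mm-ADT APIs.

```python
import json

def to_wire(ref):
    """Serialize a reference (including its dereference bytecode) to text."""
    return json.dumps(ref)

def from_wire(text):
    """Rebuild the reference on the receiving side."""
    return json.loads(text)

def dereference(db, ref):
    """Follow the reference's own bytecode against whatever store is local."""
    objs = []
    for ins in ref["deref"]:
        if ins[0] == "values":
            objs = list(db[ins[1]])
        elif ins[0] == "has":          # equality predicate only, for brevity
            objs = [o for o in objs if o.get(ins[1]) == ins[3]]
    return objs

ref = {
    "type": "project",
    "known": {"title": "tinkerpop"},
    "deref": [["values", "projects"], ["has", "title", "eq", "tinkerpop"]],
}

received = from_wire(to_wire(ref))

# A different "machine" exposing the same logical model.
remote_db = {"projects": [{"title": "tinkerpop"}, {"title": "gremlin"}]}
print(dereference(remote_db, received))  # [{'title': 'tinkerpop'}]
```

Nothing in the shipped reference names a memory address or a host, which is why it survives the
round trip unchanged.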
	
*** IF THE GRAPHICS DIDN’T COME THROUGH, HERE IS A LINK TO THEM:
	https://gist.github.com/okram/159a3652672cb15a4ea7184e1258ba6d

Marko.

http://rredux.com




