tinkerpop-dev mailing list archives

From Daniel Kuppitz ...@gremlin.guru>
Subject Re: The Fundamental Structure Instructions Already Exist! (w/ RDBMS Example)
Date Mon, 29 Apr 2019 15:28:47 GMT
>
> we don’t support 'null' in TP


I don't think it's a good idea to keep this mindset for TP4; NULLs are too
important in RDBMS. I don't know, maybe you can convince SQL people that
dropping a value is the same as setting its value to NULL. It would work
for you and me and everybody else who's familiar with Gremlin, but SQL
people really love their NULLs....
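
To make the distinction concrete, here is a minimal sketch using plain Java
maps (nothing here is TinkerPop API; the class and method names are made up
for illustration). A row whose column is explicitly NULL and a row where the
value was dropped both read back as null, but they are not the same row:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not TinkerPop API.
final class NullVersusDrop {

    // A row whose "city" column exists and holds an explicit NULL.
    static Map<String, Object> rowWithNull() {
        Map<String, Object> row = new HashMap<>();
        row.put("name", "marko");
        row.put("city", null); // HashMap permits null values
        return row;
    }

    // A row where the "city" value was dropped entirely.
    static Map<String, Object> rowWithDrop() {
        Map<String, Object> row = new HashMap<>();
        row.put("name", "marko");
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> a = rowWithNull();
        Map<String, Object> b = rowWithDrop();
        // Reading the value cannot tell the two rows apart...
        System.out.println(a.get("city") == b.get("city"));
        // ...but the key sets, and therefore the rows, differ.
        System.out.println(a.containsKey("city") + " vs " + b.containsKey("city"));
        System.out.println(a.equals(b));
    }
}
```

Whether that difference matters is exactly the question for SQL users: a
schema-bound row always has the column, so "dropped" has no natural meaning
there.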

> TSymbol: like Ruby, I think we need “enum-like” symbols (e.g., #id, #label).


I'd prefer to just have special accessors for these. E.g. g.V().meta("id").
At least valueMaps would then only have String-keys.
I see the issue with that (naming collisions), but it's still better than
the enums in my opinion (which became a pain when we started to implement
the GLVs).
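
A rough sketch of what I mean, in plain Java. MetaElement and its methods are
hypothetical names invented for this example, not an existing or proposed
TinkerPop class. Structural data sits behind a meta() accessor, so valueMap()
only ever has String keys:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical element type for illustration; not TinkerPop API.
final class MetaElement {
    private final Map<String, Object> meta = new LinkedHashMap<>();
    private final Map<String, Object> properties = new LinkedHashMap<>();

    MetaElement(Object id, String label) {
        meta.put("id", id);
        meta.put("label", label);
    }

    // Adds a user property and returns this element for chaining.
    MetaElement property(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    // Special accessor for structural data, analogous to meta("id").
    Object meta(String key) {
        return meta.get(key);
    }

    // User properties only; every key is a plain String.
    Map<String, Object> valueMap() {
        return new LinkedHashMap<>(properties);
    }

    public static void main(String[] args) {
        MetaElement v = new MetaElement(1L, "person")
                .property("name", "marko")
                .property("id", "user-supplied");
        System.out.println(v.meta("id"));
        System.out.println(v.valueMap());
    }
}
```

The collision shows up as soon as you try to merge both namespaces into one
String-keyed map: the user's "id" property and the element id want the same
key.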

Also, what I'm wondering about now: Have you thought about Stored
Procedures and Views in RDBMS? Views can be treated as tables, easy, but
what about stored procedures? SPs can be found in many more DBMSs; it would
be bad not to support them (or to hack something ugly together later in the
development process).

Cheers,
Daniel


On Mon, Apr 29, 2019 at 7:34 AM Marko Rodriguez <okrammarko@gmail.com>
wrote:

> Hi,
>
> *** This email is primarily for Josh (and Kuppitz). However, if others are
> interested… ***
>
> So I did a lot of thinking this weekend about structure/ and this morning,
> I prototyped both graph/ and rdbms/.
>
> This is the way I’m currently thinking of things:
>
>         1. There are 4 base types in structure/.
>                 - Primitive: string, long, float, int, … (will constrain
> these at some point).
>                 - TTuple<K,V>: key/value map.
>                 - TSequence<V>: an iterable of v objects.
>                 - TSymbol: like Ruby, I think we need “enum-like” symbols
> (e.g., #id, #label).
>
>         2. Every structure has a “root.”
>                 - for graph, it's a TGraph, which implements
> TSequence<TVertex>
>                 - for rdbms, it's a TDatabase, which implements
> TTuple<String,TTable>
>
>         3. Roots implement Structure and thus, are what is generated by
> StructureFactory.mint().
>                 - defined using withStructure().
>                 - For graph, it's accessible via V().
>                 - For rdbms, it's accessible via db().
>
>         4. There is a list of core instructions for dealing with these
> base objects.
>                 - value(K key): gets the TTuple value for the provided key.
>                 - values(K key): gets an iterator of the value for the
> provided key.
>                 - entries(): gets an iterator of T2Tuple objects for the
> incoming TTuple.
>                 - hasXXX(A,B): various has()-based filters for looking
> into a TTuple and a TSequence
>                 - db()/V()/etc.: jump to the “root” of the withStructure()
> structure.
>                 - drop()/add(): behave as one would expect and thus.
>
> ————
>
> For RDBMS, we have three interfaces in rdbms/.
> (machine/machine-core/structure/rdbms)
>
>         1. TDatabase implements TTuple<String,TTable> // the root
> structure that indexes the tables.
>         2. TTable implements TSequence<TRow<?>> // a table is a sequence
> of rows
>         3. TRow<V> implements TTuple<String,V> // a row has string column
> names
>
> I then created a new project at machine/structure/jdbc. The classes in
> here implement the above rdbms/ interfaces.
>
> Here is an RDBMS session:
>
> final Machine machine = LocalMachine.open();
> final TraversalSource jdbc =
>         Gremlin.traversal(machine).
>                         withProcessor(PipesProcessor.class).
>                         withStructure(JDBCStructure.class,
> Map.of(JDBCStructure.JDBC_CONNECTION, "jdbc:h2:/tmp/test"));
>
> System.out.println(jdbc.db().toList());
> System.out.println(jdbc.db().entries().toList());
> System.out.println(jdbc.db().value("people").toList());
> System.out.println(jdbc.db().values("people").toList());
> System.out.println(jdbc.db().values("people").value("name").toList());
> System.out.println(jdbc.db().values("people").entries().toList());
>
> This yields:
>
> [<database#conn1: url=jdbc:h2:/tmp/test user=>]
> [PEOPLE:<table#PEOPLE>]
> [<table#people>]
> [<row#PEOPLE:1>, <row#PEOPLE:2>]
> [marko, josh]
> [NAME:marko, AGE:29, NAME:josh, AGE:32]
>
> The bytecode of the last query is:
>
> [db(<database#conn1: url=jdbc:h2:/tmp/test user=>), values(people),
> entries]
>
> JDBCDatabase implements TDatabase, Structure.
>         *** JDBCDatabase is the root structure and is referenced by db()
> *** (CRUCIAL POINT)
>
> Assume another table called ADDRESSES with two columns: name and city.
>
>
> jdbc.db().values("people").as("x").db().values("addresses").has("name",eq(path("x").by("name"))).value("city")
>
> The above is equivalent to:
>
> SELECT city FROM people,addresses WHERE people.name=addresses.name
>
> If you want to do an inner join (a product), you do this:
>
>
> jdbc.db().values("people").as("x").db().values("addresses").has("name",eq(path("x").by("name"))).as("y").path("x","y")
>
> The above is equivalent to:
>
> SELECT * FROM addresses INNER JOIN people ON people.name=addresses.name
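
Read depth-first, the two traversals above behave like a nested loop over the
two tables. A minimal in-memory sketch in plain Java, assuming rows are
modelled as maps; the class name and the address rows ("city-a", "city-b") are
invented sample data, not part of the prototype:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// In-memory stand-ins for the PEOPLE and ADDRESSES tables (address rows are invented).
final class InnerJoinSketch {
    static final List<Map<String, Object>> PEOPLE = List.of(
            Map.of("name", "marko", "age", 29),
            Map.of("name", "josh", "age", 32));
    static final List<Map<String, Object>> ADDRESSES = List.of(
            Map.of("name", "marko", "city", "city-a"),
            Map.of("name", "josh", "city", "city-b"));

    // Mirrors ...has("name", eq(path("x").by("name"))).value("city")
    static List<Object> cities() {
        List<Object> out = new ArrayList<>();
        for (Map<String, Object> x : PEOPLE) {
            for (Map<String, Object> y : ADDRESSES) {
                if (y.get("name").equals(x.get("name"))) {
                    out.add(y.get("city"));
                }
            }
        }
        return out;
    }

    // Mirrors ...as("y").path("x","y"): each result is a matching [x, y] pair.
    static List<List<Map<String, Object>>> pairs() {
        List<List<Map<String, Object>>> out = new ArrayList<>();
        for (Map<String, Object> x : PEOPLE) {
            for (Map<String, Object> y : ADDRESSES) {
                if (y.get("name").equals(x.get("name"))) {
                    out.add(List.of(x, y));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(cities());
        System.out.println(pairs().size());
    }
}
```
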
>
> NOTES:
>         1. Instead of select(), we simply jump to the root via db() (or
> V() for graph).
>         2. Instead of project(), we simply use value() or values().
>         3. Instead of select() being overloaded with by() join syntax, we
> use has() and path().
>                 - like TP3 we will be smart about dropping path() data
> once it's no longer referenced.
>         4. We can also do LEFT and RIGHT JOINs (haven’t thought through
> FULL OUTER JOIN yet).
>                 - however, we don’t support 'null' in TP, so I don’t know
> if we want to support these null-producing joins.
>
> LEFT JOIN:
>         * If an address doesn’t exist for the person, emit a “null”-filled
> path.
>
> jdbc.db().values("people").as("x").
>   db().values("addresses").as("y").
>     choose(has("name",eq(path("x").by("name"))),
>       identity(),
>       path("y").by(null).as("y")).
>   path("x","y")
>
> SELECT * FROM people LEFT JOIN addresses ON people.name=addresses.name
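
For reference, the LEFT JOIN behaviour described in words above (keep every
person, and pair them with null when no address matches) can be sketched in
plain Java as below. This models the described SQL semantics rather than the
exact step-by-step behaviour of the traversal; the class, the Pair type and
the sample rows are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory LEFT JOIN of people (left side) against addresses.
final class LeftJoinSketch {

    // One [x, y] path; y is null when the person has no matching address.
    static final class Pair {
        final Map<String, Object> x;
        final Map<String, Object> y;
        Pair(Map<String, Object> x, Map<String, Object> y) {
            this.x = x;
            this.y = y;
        }
    }

    static List<Pair> leftJoin(List<Map<String, Object>> people,
                               List<Map<String, Object>> addresses) {
        List<Pair> out = new ArrayList<>();
        for (Map<String, Object> x : people) {
            boolean matched = false;
            for (Map<String, Object> y : addresses) {
                if (Objects.equals(x.get("name"), y.get("name"))) {
                    out.add(new Pair(x, y));
                    matched = true;
                }
            }
            if (!matched) {
                out.add(new Pair(x, null)); // the "null"-filled path
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> people = List.of(
                Map.of("name", "marko"), Map.of("name", "josh"));
        List<Map<String, Object>> addresses = List.of(
                Map.of("name", "marko", "city", "city-a")); // invented sample row
        for (Pair p : leftJoin(people, addresses)) {
            System.out.println(p.x + " -> " + p.y);
        }
    }
}
```

The null on the right-hand side is the part that has no representation if
null stays unsupported.
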
>
> RIGHT JOIN:
>
> jdbc.db().values("people").as("x").
>   db().values("addresses").as("y").
>     choose(has("name",eq(path("x").by("name"))),
>       identity(),
>       path("x").by(null).as("x")).
>   path("x","y")
>
>
> SUMMARY:
>
> There are no “low level” instructions. Everything is based on the standard
> instructions that we know and love. Finally, if not apparent, the above
> bytecode chunks would ultimately get strategized into a single SQL query
> (breadth-first) instead of one-off queries (depth-first) to improve
> performance.
>
> Neat?,
> Marko.
>
> http://rredux.com
>
