tinkerpop-dev mailing list archives

From pieter-gmail <pieter.mar...@gmail.com>
Subject Re: TraversalStrategies.setTraverserGeneratorFactory's removal
Date Thu, 07 Apr 2016 16:59:27 GMT
Hi,

I have been working on this without using a custom traverser. It is fine
but I do need to be able to set the path of the traverser.

We had a ticket for this previously here
<https://issues.apache.org/jira/browse/TINKERPOP-766>. I mentioned there
that I no longer needed the path setter and you mentioned that is ok but
it was never done.

To give some background:

g.V(a1).as("a").out().as("b").select("a", "b")

This will be compiled to one step, i.e. the collapsed steps' label
information is somewhat obfuscated and the path is incorrectly
calculated by the traverser.
However, the label information is not lost: I recalculate the path but
need to set it on the traverser.
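
The recalculation described above can be sketched roughly as follows.
This is a self-contained illustration only, not Sqlg or TinkerPop code:
LabeledPath, SketchTraverser, setPath and rebuild are hypothetical
stand-ins, and the vertex names "a1" and "b1" are made up. It assumes
the collapsed step still knows which labels belonged to each original
step, and shows why a path setter is needed once the path has been
rebuilt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-ins for illustration; none of these are TinkerPop classes. */
final class PathRebuildSketch {

    /** Immutable path: parallel lists of visited objects and the labels attached to each. */
    static final class LabeledPath {
        private final List<Object> objects;
        private final List<Set<String>> labels;

        private LabeledPath(List<Object> objects, List<Set<String>> labels) {
            this.objects = objects;
            this.labels = labels;
        }

        static LabeledPath empty() {
            return new LabeledPath(new ArrayList<>(), new ArrayList<>());
        }

        /** Returns a new path with one more element; the receiver is left untouched. */
        LabeledPath extend(Object object, Set<String> stepLabels) {
            List<Object> newObjects = new ArrayList<>(objects);
            List<Set<String>> newLabels = new ArrayList<>(labels);
            newObjects.add(object);
            newLabels.add(new LinkedHashSet<>(stepLabels));
            return new LabeledPath(newObjects, newLabels);
        }

        /** Returns the last object carrying the label, or null if no element has it. */
        Object get(String label) {
            for (int i = objects.size() - 1; i >= 0; i--) {
                if (labels.get(i).contains(label)) {
                    return objects.get(i);
                }
            }
            return null;
        }

        int size() {
            return objects.size();
        }
    }

    /** A traverser whose path can be replaced after the fact (the setter under discussion). */
    static final class SketchTraverser {
        private final Object current;
        private LabeledPath path;

        SketchTraverser(Object current, LabeledPath path) {
            this.current = current;
            this.path = path;
        }

        Object get() { return current; }
        LabeledPath path() { return path; }
        void setPath(LabeledPath path) { this.path = path; }
    }

    /** Rebuilds the path for one result row from the labels of the collapsed steps. */
    static LabeledPath rebuild(List<?> row, List<? extends Set<String>> collapsedLabels) {
        if (row.size() != collapsedLabels.size()) {
            throw new IllegalArgumentException("one label set is required per row element");
        }
        LabeledPath path = LabeledPath.empty();
        for (int i = 0; i < row.size(); i++) {
            path = path.extend(row.get(i), collapsedLabels.get(i));
        }
        return path;
    }

    public static void main(String[] args) {
        // One result row for g.V(a1).as("a").out().as("b"): start vertex, then one neighbour.
        List<Object> row = Arrays.asList("a1", "b1");
        List<Set<String>> labels =
                Arrays.asList(Collections.singleton("a"), Collections.singleton("b"));

        // Assumption: the single collapsed step only records the final element in the path.
        SketchTraverser traverser =
                new SketchTraverser("b1", LabeledPath.empty().extend("b1", Collections.singleton("b")));

        // Replace the incomplete path with the recalculated one.
        traverser.setPath(rebuild(row, labels));

        System.out.println(traverser.path().get("a") + " " + traverser.path().get("b")); // prints "a1 b1"
    }
}
```

Without setPath the traverser above would keep its one-element path, so
select("a", "b") would have nothing to resolve "a" against. Building a
fresh path and swapping it in keeps the path object itself immutable;
only the traverser's reference to it changes.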

Is it still ok to make the path mutable?

Thanks
Pieter

On 30/03/2016 23:19, Marko Rodriguez wrote:
> Hi,
>
> So Titan does something similar where it takes a row in Cassandra and
> turns it into Traversers. It uses FlatMapStep to do so where the
> iterator in FlatMapStep is a custom iterator that knows how to do data
> conversions.
>
> Would something like that help?
>
> If not and you really need your own TraverserGenerator, then you can
> use reflection to set it in DefaultTraversal. It's a private member now.
>
> Moving forward, I would highly recommend you don't create classes so
> low in the stack. Graph database providers should only create (if
> necessary):
>
> 	1. Steps that extend non-final TinkerPop steps.
> 	2. TraversalStrategies that implement ProviderOptimizationStrategy.
> 	3. Classes that extend Graph, Vertex, Edge, Property, VertexProperty, and GraphComputer.
> 	4. Their own InputFormat or InputRDD if they want to have Spark/Giraph work against them.
>
> Anything beyond that (I think) is starting to get into murky territory.
>
> Marko.
>
> http://markorodriguez.com
>
> On Mar 30, 2016, at 2:46 PM, pieter-gmail <pieter.martin@gmail.com> wrote:
>
>> Hi,
>>
>> I need it to keep state. Mapping sql ResultSet's grid nature to a graph
>> nature gets complex quickly and being able to store and manipulate the
>> state of the traverser makes it easier. Also it was possible and the
>> solution presented itself to me as such.
>>
>> A single row, i.e. AbstractStep.processNextStart(), from a sql ResultSet
>> might map to many traversers. This is at the heart of what makes
>> Sqlg a worthwhile enterprise: fetching lots of data (reducing latency
>> cost) in a denormalized manner and mapping it to a graph format.
>>
>> To manage this I needed to store state and add a custom method
>> "customSplit(...)". customSplit is similar to split() but it uses and
>> updates the said state. The custom traverser also keeps additional state
>> of where we are with respect to the processNextStart() (a sql row) as I
>> need it in order to calculate the next traverser from the same row.
>>
>> So if all this becomes impossible there is bound to be a different
>> solution to the same problem but it would require quite some thinking
>> and effort.
>>
>> With a little bit of arrogance and ignorance, perhaps letting OLAP
>> constraints leak into OLTP is not a good idea. I'd say OLTP is 99% of
>> use-cases so whatever these serialization issues are they ought to be
>> contained to OLAP.
>>
>> Thanks
>> Pieter
>>
>>
>>
>> On 30/03/2016 21:38, Marko Rodriguez wrote:
>>> Hello Pieter,
>>>
>>>> In SqlgGraph <https://github.com/pietermartin/sqlg/blob/schema/sqlg-core/src/main/java/org/umlg/sqlg/structure/SqlgGraph.java>
>>>> in a static code block invokes
>>>>
>>>> static {
>>>>  TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>> TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone().addStrategies(new
>>>> SqlgVertexStepStrategy()));
>>>>  TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>> TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone().addStrategies(new
>>>> SqlgGraphStepStrategy()));
>>>>  TraversalStrategies.GlobalCache.getStrategies(Graph.class).setTraverserGeneratorFactory(new
>>>> SqlgTraverserGeneratorFactory());
>>>> }
>>> This all looks great except the TraverserGeneratorFactory. Traverser
>>> classes are so low-level and so tied to serialization code in OLAP
>>> that I removed all concept of users being able to create traverser
>>> species. I need full control at that level to maneuver.
>>>
>>> I really need to create a section in the docs that says stuff like:
>>>
>>> 	* Graph System Providers: only implement steps that extend non-final TinkerPop-steps (e.g. GraphStep, VertexStep, etc.).
>>> 	* Graph Language Providers: only have Traversal.steps() that can be represented as a composition of TinkerPop-steps.
>>>
>>> When providers get too low level, then it's hard for us to maneuver
>>> and optimize and move forward with designs. There are so many
>>> assumptions in the code that we make around Traverser instances, Step
>>> interfaces, etc. that if people just make new ones, then strategies,
>>> serialization, etc. break down.
>>>
>>> The question I have: why do you have your own Traverser
>>> implementation? I can't imagine a reason why a provider needs their
>>> own traverser class. ??
>>>
>>> Thanks,
>>> Marko.
>>>
>>> http://markorodriguez.com
>

