tinkerpop-dev mailing list archives

From pieter-gmail <pieter.mar...@gmail.com>
Subject Re: TraversalStrategies.setTraverserGeneratorFactory's removal
Date Thu, 07 Apr 2016 18:11:45 GMT
I mean to make the Traverser.path variable mutable.

traverser.setPath(path)

However, let me see if I can simulate the splits as you indicated.

Thanks
Pieter

On 07/04/2016 19:54, Marko Rodriguez wrote:
> Hi,
>
> There are ImmutablePath and MutablePath. ImmutablePath is used in OLTP and is much more
> efficient in terms of space & time than MutablePath. You can create either as you please;
> you just do:
>
> Path path = XXXPath.make()
> path = path.extend(…)
> path = path.extend(…)
>
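The reassignment in `path = path.extend(…)` is easier to see with a toy model. The following is a minimal JDK-only sketch; `ToyImmutablePath` and `ToyMutablePath` are hypothetical names invented for illustration and do not reproduce TinkerPop's ImmutablePath or MutablePath. It assumes only the general idea that an immutable extend returns a new object that shares the existing tail, while a mutable extend appends in place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, not TinkerPop's ImmutablePath: each extend()
// returns a new node pointing at the previous one, so earlier paths are
// never modified and can be shared.
final class ToyImmutablePath {
    private final ToyImmutablePath previous; // null marks the empty path
    private final Object head;

    private ToyImmutablePath(ToyImmutablePath previous, Object head) {
        this.previous = previous;
        this.head = head;
    }

    static ToyImmutablePath make() {
        return new ToyImmutablePath(null, null);
    }

    ToyImmutablePath extend(Object object) {
        return new ToyImmutablePath(this, object);
    }

    // Walks back to the empty path and returns the objects in visit order.
    List<Object> objects() {
        List<Object> result = new ArrayList<>();
        for (ToyImmutablePath p = this; p.previous != null; p = p.previous) {
            result.add(0, p.head);
        }
        return result;
    }
}

// Hypothetical toy model of a mutable path: extend() appends in place
// and returns the same instance.
final class ToyMutablePath {
    private final List<Object> objects = new ArrayList<>();

    static ToyMutablePath make() {
        return new ToyMutablePath();
    }

    ToyMutablePath extend(Object object) {
        objects.add(object);
        return this;
    }

    List<Object> objects() {
        return new ArrayList<>(objects);
    }
}
```

In this sketch, extending an immutable path leaves the shorter path usable by whoever still holds it, which is why the result has to be assigned back to the variable.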
> However, I don't recommend you work at that level. What I would recommend you do is this:
>
> public class MyBigSelectLabelStep<S,E> {
>
>   Traverser traverser = this.starts.next();
>   Map<String,Object> result = doLowLevelProviderSpecificStuff(traverser);
>   traverser = traverser.split(result.get("a"), EmptyStep.instance()); // simulate GraphStep
>   traverser.addLabels("a");
>   traverser = traverser.split(result.get("b"), EmptyStep.instance()); // simulate VertexStep
>   traverser.addLabels("b");
>   traverser = traverser.split(result, EmptyStep.instance()); // simulate SelectStep
>   return traverser;
>
> }
>
> This handles all the low-level mechanisms of generating a traverser that looks like it
> went through multiple steps even though it only went through one. This is hand-typed from
> memory of the API so please be aware that I might have gotten an argument wrong or something.
> Also, out() is a FlatMapStep and thus, processes an iterator of results -- you will have to
> be smart to flatten that iterator… see FlatMapStep's implementation for how this is typically
> handled in TinkerPop.
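The effect of the repeated split-and-label calls can be modelled without TinkerPop on the classpath. This is a minimal sketch under stated assumptions: `ToyTraverser` and `CollapsedStepDemo` are hypothetical names, the `split(...)` here takes one argument instead of the two shown in the message, and it simply assumes that each split extends the path by one object whose labels can be set afterwards. It is not TinkerPop's Traverser.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy traverser: it carries the current object plus the
// path of objects seen so far, with a set of labels per path position.
final class ToyTraverser {
    private final Object current;
    private final List<Object> objects = new ArrayList<>();
    private final List<Set<String>> labels = new ArrayList<>();

    private ToyTraverser(Object current) {
        this.current = current;
    }

    static ToyTraverser start(Object object) {
        ToyTraverser t = new ToyTraverser(object);
        t.objects.add(object);
        t.labels.add(new LinkedHashSet<String>());
        return t;
    }

    // Returns a new traverser whose path is this path plus `next`.
    ToyTraverser split(Object next) {
        ToyTraverser child = new ToyTraverser(next);
        child.objects.addAll(objects);
        for (Set<String> s : labels) {
            child.labels.add(new LinkedHashSet<String>(s)); // copy, so the parent is unaffected
        }
        child.objects.add(next);
        child.labels.add(new LinkedHashSet<String>());
        return child;
    }

    // Labels the most recent path position.
    void addLabels(String... names) {
        Set<String> last = labels.get(labels.size() - 1);
        for (String name : names) {
            last.add(name);
        }
    }

    Object get() {
        return current;
    }

    List<Object> path() {
        return new ArrayList<>(objects);
    }

    // Finds the most recent path object carrying the given label.
    Object select(String label) {
        for (int i = labels.size() - 1; i >= 0; i--) {
            if (labels.get(i).contains(label)) {
                return objects.get(i);
            }
        }
        throw new IllegalArgumentException("No such label: " + label);
    }
}

final class CollapsedStepDemo {
    // One collapsed step emitting a traverser that looks as if it had
    // passed through three steps; `row` stands in for the provider result.
    static ToyTraverser emit(ToyTraverser start, Map<String, Object> row) {
        ToyTraverser t = start.split(row.get("a"));
        t.addLabels("a");
        t = t.split(row.get("b"));
        t.addLabels("b");
        return t.split(row);
    }
}
```

In this model the emitted traverser ends up with a path of four entries (the start, the "a" object, the "b" object, and the selected map), so no path setter is needed.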
>
> HTH,
> Marko. 
>
> http://markorodriguez.com
>
> On Apr 7, 2016, at 10:59 AM, pieter-gmail <pieter.martin@gmail.com> wrote:
>
>> Hi,
>>
>> I have been working on this without using a custom traverser. It is fine
>> but I do need to be able to set the path of the traverser.
>>
>> We had a ticket for this previously here
>> <https://issues.apache.org/jira/browse/TINKERPOP-766>. I mentioned there
>> that I no longer needed the path setter and you mentioned that is ok but
>> it was never done.
>>
>> To give some background.
>>
>> g.V(a1).as("a").out().as("b").select("a", "b")
>>
>> This will be compiled to one step, which means that the collapsed
>> steps' label information is somewhat obfuscated and the path is
>> incorrectly calculated by the traverser.
>> However the label information is not lost and I recalculate the path but
>> need to set it on the traverser.
>>
>> Is it still ok to make the path mutable?
>>
>> Thanks
>> Pieter
>>
>> On 30/03/2016 23:19, Marko Rodriguez wrote:
>>> Hi,
>>>
>>> So Titan does something similar where it takes a row in Cassandra and turns those
into Traversers. It uses FlatMapStep to do so where the iterator in FlatMapStep is a custom
iterator that knows how to do data conversions.
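The general shape of such an iterator (take one start, turn it into an iterator of results, drain it, then move to the next start) can be sketched with the JDK alone. `FlatteningIterator` is a hypothetical name and this is not the code of FlatMapStep or of Titan; the conversion function is whatever provider-specific mapping from a raw row to elements applies.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical sketch: flattens Iterator<S> into Iterator<E> by applying
// a function that converts each start into zero or more results.
final class FlatteningIterator<S, E> implements Iterator<E> {
    private final Iterator<S> starts;
    private final Function<S, Iterator<E>> flatMap;
    private Iterator<E> current = Collections.emptyIterator();

    FlatteningIterator(Iterator<S> starts, Function<S, Iterator<E>> flatMap) {
        this.starts = starts;
        this.flatMap = flatMap;
    }

    @Override
    public boolean hasNext() {
        // Skip over starts that produce no results at all.
        while (!current.hasNext()) {
            if (!starts.hasNext()) {
                return false;
            }
            current = flatMap.apply(starts.next());
        }
        return true;
    }

    @Override
    public E next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

The loop in `hasNext()` matters: a start that converts to an empty iterator must not end the whole stream.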
>>>
>>> Would something like that help?
>>>
>>> If not and you really need your own TraverserGenerator, then you can use reflection
>>> to set it in DefaultTraversal. It's a private member now.
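For readers unfamiliar with the technique, setting a private member by reflection looks roughly like the sketch below. It uses only the JDK; `Holder`, its `generator` field, and `PrivateFieldSetter` are hypothetical stand-ins, since the thread does not name the actual field in DefaultTraversal, and that name would have to be checked against the TinkerPop version in use.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a class with a private member and no setter.
final class Holder {
    private Object generator = "default";

    Object generator() {
        return generator;
    }
}

final class PrivateFieldSetter {
    // Overwrites a private field declared directly on the target's class.
    // Fields declared on a superclass would need a walk up getSuperclass().
    static void set(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // can be refused by a security manager or module boundary
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not set field " + fieldName, e);
        }
    }
}
```

Code like this depends on a private name that can change between releases without notice, which fits the advice in this message to stay out of that layer.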
>>>
>>> Moving forward, I would highly recommend you don't create classes so low in the
stack. Graph database providers should only create (if necessary):
>>>
>>> 	1. Steps that extend non-final TinkerPop steps.
>>> 	2. TraversalStrategies that implement ProviderOptimizationStrategy.
>>> 	3. Classes that extend Graph, Vertex, Edge, Property, VertexProperty, and GraphComputer.
>>> 	4. Their own InputFormat or InputRDD if they want to have Spark/Giraph work
against them.
>>>
>>> Anything beyond that (I think) is starting to get into murky territory.
>>>
>>> Marko.
>>>
>>> http://markorodriguez.com
>>>
>>> On Mar 30, 2016, at 2:46 PM, pieter-gmail <pieter.martin@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I need it to keep state. Mapping sql ResultSet's grid nature to a graph
>>>> nature gets complex quickly and being able to store and manipulate the
>>>> state of the traverser makes it easier. Also it was possible and the
>>>> solution presented itself to me as such.
>>>>
>>>> A single row i.e. AbstractStep.processNextStart() from a sql ResultSet
>>>> might map to many traversers. This is at the heart of what makes
>>>> Sqlg a worthwhile enterprise: to fetch lots of data (reduce latency
>>>> cost) in a denormalized manner and map it to a graph format.
>>>>
>>>> To manage this I needed to store state and add a custom method
>>>> "customSplit(...)". customSplit is similar to split() but it uses and
>>>> updates the said state. The custom traverser also keeps additional state
>>>> of where we are with respect to the processNextStart() (a sql row) as I
>>>> need it in order to calculate the next traverser from the same row.
>>>>
>>>> So if all this becomes impossible there is bound to be a different
>>>> solution to the same problem but it would require quite some thinking
>>>> and effort.
>>>>
>>>> With a little bit of arrogance and ignorance, perhaps letting OLAP
>>>> constraints leak into OLTP is not a good idea. I'd say OLTP is 99% of
>>>> use-cases so whatever these serialization issues are they ought to be
>>>> contained to OLAP.
>>>>
>>>> Thanks
>>>> Pieter
>>>>
>>>>
>>>>
>>>> On 30/03/2016 21:38, Marko Rodriguez wrote:
>>>>> Hello Pieter,
>>>>>
>>>>>> SqlgGraph <https://github.com/pietermartin/sqlg/blob/schema/sqlg-core/src/main/java/org/umlg/sqlg/structure/SqlgGraph.java>
>>>>>> invokes the following in a static code block:
>>>>>>
>>>>>> static {
>>>>>> TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>>>> TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone().addStrategies(new
>>>>>> SqlgVertexStepStrategy()));
>>>>>> TraversalStrategies.GlobalCache.registerStrategies(Graph.class,
>>>>>> TraversalStrategies.GlobalCache.getStrategies(Graph.class).clone().addStrategies(new
>>>>>> SqlgGraphStepStrategy()));
>>>>>> TraversalStrategies.GlobalCache.getStrategies(Graph.class).setTraverserGeneratorFactory(new
>>>>>> SqlgTraverserGeneratorFactory());
>>>>>> }
>>>>> This all looks great except the TraverserGeneratorFactory. Traverser
>>>>> classes are so low-level and so tied to serialization code in OLAP that I removed all concept
>>>>> of users being able to create traverser species. I need full control at that level to maneuver.
>>>>>
>>>>> I really need to create a section in the docs that says stuff like:
>>>>>
>>>>> 	* Graph System Providers: only implement steps that extend non-final
TinkerPop-steps (e.g. GraphStep, VertexStep, etc.).
>>>>> 	* Graph Language Providers: only have Traversal.steps() that can be
represented as a composition of TinkerPop-steps.
>>>>>
>>>>> When providers get too low level, then it's hard for us to maneuver and
>>>>> optimize and move forward with designs. There are so many assumptions in the code that we make
>>>>> around Traverser instances, Step interfaces, etc. that if people just make new ones, then
>>>>> strategies, serialization, etc. break down.
>>>>>
>>>>> The question I have: why do you have your own Traverser implementation?
>>>>> I can't imagine a reason why a provider would need their own traverser class.
>>>>>
>>>>> Thanks,
>>>>> Marko.
>>>>>
>>>>> http://markorodriguez.com
>

