tinkerpop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TINKERPOP-1254) Support dropping traverser path information when it is no longer needed.
Date Sun, 10 Jul 2016 16:13:11 GMT

    [ https://issues.apache.org/jira/browse/TINKERPOP-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369701#comment-15369701
] 

ASF GitHub Bot commented on TINKERPOP-1254:
-------------------------------------------

Github user twilmes commented on the issue:

    https://github.com/apache/tinkerpop/pull/358
  
    I made a small update to `ReferencePath` to create new label Sets when a patch is detached.
 This had been causing issues where the first set of labels for a path where being shared
across `MutablePaths` after detachment.  A label would be removed from one, and therefore
all of the traversers that had that label `Set` in their path, would be affected.
    
    The `PathProcessors` are now respecting keepLabels null and labels are not dropped if
`PrunePathStrategy` is not applied.
    
    **PrunePathStrategy on**
    ```
    g.V().match(
        as("a").in("sungBy").as("b"),
        as("a").in("sungBy").as("c"),
        as("b").out("writtenBy").as("d"),
        as("c").out("writtenBy").as("e"),
        as("d").has("name", "George_Harrison"),
        as("e").has("name", "Bob_Marley")).select("a").count().profile()
    
    Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    GraphStep(vertex,[])                                                 808         808 
        44.217    99.97
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...                     1           1 
         0.004     0.01
      MatchStartStep(a)                                                  808         808 
        43.517
      VertexStep(IN,[sungBy],vertex)                                     501         499 
        20.323
      MatchEndStep(b) (profiling ignored)                                                
         0.000
      MatchStartStep(a)                                                    2           2 
         0.006
      VertexStep(IN,[sungBy],vertex)                                     156         156 
         2.129
      MatchEndStep(c) (profiling ignored)                                                
         0.000
      MatchStartStep(b)                                                  501         499 
         5.126
      VertexStep(OUT,[writtenBy],vertex)                                 509         504 
         3.423
      MatchEndStep(d) (profiling ignored)                                                
         0.000
      MatchStartStep(c)                                                  156         156 
         1.083
      VertexStep(OUT,[writtenBy],vertex)                                 157         157 
         1.029
      MatchEndStep(e) (profiling ignored)                                                
         0.000
      MatchStartStep(d)                                                  509         266 
         1.685
      HasStep([name.eq(George_Harrison)])                                  2           2 
         0.002
      MatchEndStep (profiling ignored)                                                   
         0.000
      MatchStartStep(e)                                                  157          57 
         0.391
      HasStep([name.eq(Bob_Marley)])                                       1           1 
         0.001
      MatchEndStep (profiling ignored)                                                   
         0.000
    SelectOneStep(a)                                                       1           1 
         0.003     0.01
    CountGlobalStep                                                        1           1 
         0.003     0.01
                                                >TOTAL                     -          
-          44.228        -
    ```
    **PrunePathStrategy off**
    ```
    Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    GraphStep(vertex,[])                                                 808         808 
         7.565    99.84
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...                     1           1 
         0.007     0.10
      MatchStartStep(a)                                                  808         808 
         5.726
      VertexStep(IN,[sungBy],vertex)                                     501         499 
         9.532
      MatchEndStep(b) (profiling ignored)                                                
         0.000
      MatchStartStep(a)                                                    2           2 
         0.007
      VertexStep(IN,[sungBy],vertex)                                     156         156 
         1.803
      MatchEndStep(c) (profiling ignored)                                                
         0.000
      MatchStartStep(b)                                                  501         499 
         3.221
      VertexStep(OUT,[writtenBy],vertex)                                 509         504 
         2.297
      MatchEndStep(d) (profiling ignored)                                                
         0.000
      MatchStartStep(c)                                                  156         156 
         1.192
      VertexStep(OUT,[writtenBy],vertex)                                 157         157 
         0.910
      MatchEndStep(e) (profiling ignored)                                                
         0.000
      MatchStartStep(d)                                                  509         504 
         1.533
      HasStep([name.eq(George_Harrison)])                                  2           2 
         0.002
      MatchEndStep (profiling ignored)                                                   
         0.000
      MatchStartStep(e)                                                  157         157 
         1.425
      HasStep([name.eq(Bob_Marley)])                                       1           1 
         0.000
      MatchEndStep (profiling ignored)                                                   
         0.000
    SelectOneStep(a)                                                       1           1 
         0.003     0.04
    CountGlobalStep                                                        1           1 
         0.001     0.02
                                                >TOTAL                     -          
-           7.577        -
    ```
    
    I am running the integration tests now and will respond back when they complete.
    
     


> Support dropping traverser path information when it is no longer needed.
> ------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1254
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1254
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Ted Wilmes
>
> The most expensive traversals (especially in OLAP) are those that can not be "bulked."
There are various reasons why two traversers at the same object can not be bulked, but the
primary reason is {{PATH}} or {{LABELED_PATH}}. That is, when the history of the traverser
is required, the probability of two traversers having the same history is low.
> A key to making traversals more efficient is to do as a much as possible to remove historic
information from a traverser so it can get bulked. How does one do this? 
> {code}
> g.V.as('a').out().as('b').out().where(neq('a').and().neq('b')).both().name
> {code}
> The {{LABELED_PATH}} of "a" and "b" are required up to the {{where()}} and at which point,
at {{both()}}, they are no longer required. It would be smart to support:
> {code}
> traverser.dropLabels(Set<String>)
> traverser.dropPath()
> {code}
> We would then, via a {{TraversalOptimizationStrategy}} insert a step between {{where()}}
and {{both()}} called {{PathPruneStep}} which would be a {{SideEffectStep}}. The strategy
would know which labels were no longer needed (via forward lookahead) and then do:
> {code}
> public class PathPruneStep {
>   final Set<String> dropLabels = ...
>   final boolean dropPath = ...
>   public void sideEffect(final Traverser<S> traverser) {
>     final Traverser<S> start = this.starts.next();
>     if(this.dropPath) start.dropPath();
>     else start.dropLabels(labels); 
>   }
> }
> {code}
> Again, the more we can prune historic path data no longer needed, the higher the probability
of bulking. Think about this in terms of {{match()}}.
> {code}
> g.V().match(
>   a.out.b,
>   b.out.c,
>   c.neq.a,
>   c.out.b,
> ).select("a")
> {code}
> All we need is "a" at the end. Thus, once a pattern has been passed and no future patterns
require that label, drop it! 
> This idea is related to TINKERPOP-331, but I don't think we should deal with manipulating
the species. Thus, I think 331 is too "low level."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message