tinkerpop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TINKERPOP-1254) Support dropping traverser path information when it is no longer needed.
Date Sun, 10 Jul 2016 18:30:11 GMT

    [ https://issues.apache.org/jira/browse/TINKERPOP-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369889#comment-15369889
] 

ASF GitHub Bot commented on TINKERPOP-1254:
-------------------------------------------

Github user okram commented on the issue:

    https://github.com/apache/tinkerpop/pull/358
  
    Verified the following simply traversal on Grateful Dead graph. For OLTP (`MatchStep.standardAlgorithm()`),
I added a lazy barrier internal to `MatchStep` which is similar in behavior to the lazy barrier
OLTP optimization added in `PrunePathStrategy`.
    
    ```
    source.V().match(
        as("a").out().as("b"),
        as("b").out().as("c")).
      select("c").count().profile()
    ```
    
    *With and then without PrunePathStrategy OLAP*
    
    ```
    Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    GraphStep(vertex,[])                                                 808         808 
        40.418    43.47
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...                327370         452 
        28.015    30.13
      MatchStartStep(a)                                                  808         808 
        61.751
      VertexStep(OUT,vertex)                                            8049        7957 
        37.971
      MatchEndStep(b) (profiling ignored)                                                
         0.000
      MatchStartStep(b)                                                 8049         562 
        14.453
      VertexStep(OUT,vertex)                                          327370        7510 
        16.124
      MatchEndStep(c) (profiling ignored)                                                
         0.000
    SelectOneStep(c)                                                  327370         452 
        24.531    26.39
    CountGlobalStep                                                        1           1 
         0.004     0.00
                                                >TOTAL                     -          
-          92.969        -
    Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    GraphStep(vertex,[])                                                 808         808 
         6.844     0.35
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...                327370      326983 
      1257.050    64.22
      MatchStartStep(a)                                                  808         808 
        11.806
      VertexStep(OUT,vertex)                                            8049        7957 
        30.286
      MatchEndStep(b) (profiling ignored)                                                
         0.000
      MatchStartStep(b)                                                 8049        7957 
        14.627
      VertexStep(OUT,vertex)                                          327370      326983 
       774.934
      MatchEndStep(c) (profiling ignored)                                                
         0.000
    SelectOneStep(c)                                                  327370      326983 
       693.564    35.43
    CountGlobalStep                                                        1           1 
         0.001     0.00
                                                >TOTAL                     -          
-        1957.460        -
    ```
    
    *With and then without PrunePathStrategy OLTP*
    
    ```
    Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    TinkerGraphStep(vertex,[])                                           808         808 
         2.050     0.07
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...                327370       56814 
      2794.288    91.47
      MatchStartStep(a)                                                  808         808 
       103.524
      VertexStep(OUT,vertex)                                            8049        8049 
       107.345
      MatchEndStep(b)                                                   8049        8049 
       159.505
      MatchStartStep(b)                                                 8049        7957 
       105.127
      VertexStep(OUT,vertex)                                          327370      327370 
       351.935
      MatchEndStep(c)                                                 327370      327370 
      1331.478
    NoOpBarrierStep(1000)                                             327370         452 
       254.309     8.32
    SelectOneStep(c)                                                  327370         452 
         3.731     0.12
    CountGlobalStep                                                        1           1 
         0.465     0.02
                                                >TOTAL                     -          
-        3054.845        -
    
    Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    TinkerGraphStep(vertex,[])                                           808         808 
      2023.710    36.87
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...                327370      326983 
      2995.219    54.57
      MatchStartStep(a)                                                  808         808 
       221.105
      VertexStep(OUT,vertex)                                            8049        8049 
       185.946
      MatchEndStep(b)                                                   8049        8049 
       186.814
      MatchStartStep(b)                                                 8049        7957 
       161.012
      VertexStep(OUT,vertex)                                          327370      327370 
       405.787
      MatchEndStep(c)                                                 327370      327370 
       423.213
    SelectOneStep(c)                                                  327370      326983 
       324.356     5.91
    CountGlobalStep                                                        1           1 
       145.952     2.66
                                                >TOTAL                     -          
-        5489.238        -
    ```


> Support dropping traverser path information when it is no longer needed.
> ------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1254
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1254
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.1-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Ted Wilmes
>
> The most expensive traversals (especially in OLAP) are those that can not be "bulked."
There are various reasons why two traversers at the same object can not be bulked, but the
primary reason is {{PATH}} or {{LABELED_PATH}}. That is, when the history of the traverser
is required, the probability of two traversers having the same history is low.
> A key to making traversals more efficient is to do as a much as possible to remove historic
information from a traverser so it can get bulked. How does one do this? 
> {code}
> g.V.as('a').out().as('b').out().where(neq('a').and().neq('b')).both().name
> {code}
> The {{LABELED_PATH}} of "a" and "b" are required up to the {{where()}} and at which point,
at {{both()}}, they are no longer required. It would be smart to support:
> {code}
> traverser.dropLabels(Set<String>)
> traverser.dropPath()
> {code}
> We would then, via a {{TraversalOptimizationStrategy}} insert a step between {{where()}}
and {{both()}} called {{PathPruneStep}} which would be a {{SideEffectStep}}. The strategy
would know which labels were no longer needed (via forward lookahead) and then do:
> {code}
> public class PathPruneStep {
>   final Set<String> dropLabels = ...
>   final boolean dropPath = ...
>   public void sideEffect(final Traverser<S> traverser) {
>     final Traverser<S> start = this.starts.next();
>     if(this.dropPath) start.dropPath();
>     else start.dropLabels(labels); 
>   }
> }
> {code}
> Again, the more we can prune historic path data no longer needed, the higher the probability
of bulking. Think about this in terms of {{match()}}.
> {code}
> g.V().match(
>   a.out.b,
>   b.out.c,
>   c.neq.a,
>   c.out.b,
> ).select("a")
> {code}
> All we need is "a" at the end. Thus, once a pattern has been passed and no future patterns
require that label, drop it! 
> This idea is related to TINKERPOP-331, but I don't think we should deal with manipulating
the species. Thus, I think 331 is too "low level."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message