tinkerpop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From twilmes <...@git.apache.org>
Subject [GitHub] tinkerpop pull request #358: TINKERPOP-1254 Support dropping traverser path ...
Date Fri, 08 Jul 2016 13:30:04 GMT
GitHub user twilmes opened a pull request:

    https://github.com/apache/tinkerpop/pull/358

    TINKERPOP-1254 Support dropping traverser path information when it is no longer needed

    This PR adds support for path retraction to increase the likelihood of bulking in OLTP
and OLAP modes. Traversal analysis is performed during the application of the PrunePathStrategy
to identify labels that may be dropped at various points in the traversal.  MatchStep also
performs runtime analysis to determine which labels it can drop in addition to the labels
identified during traversal strategy application.
    
    Here is a set of profiles showing the benefit of path dropping.  These were generated
with `TinkerGraphComputer` first against 3.2 and then TinkerPop-1254.  **Note** that the times
should not be compared here.  The first was run on my anemic macbook which, and the second
was run on an 8 core AWS m3.2xlarge instance that I've been using for testing.  I'll be following
up with more numbers on the same hardware but you can see here the dramatic drop in traversers
with path pruning enabled.
    
    **TP 3.2**
    ```
    gremlin> g.V().match(__.as('a').out().as('b'), __.as('b').out().as('c'), __.as('c').out().as('d')).select('d').count().profile()
    ==>Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    TinkerGraphStep(vertex,[])                                           808         808 
        21.205     0.02
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...              14465066    14465066 
    110317.813    85.88
      MatchStartStep(a)                                                  808         808 
     10975.797
      VertexStep(OUT,vertex)                                            8049        8049 
      6953.071
      MatchEndStep(b)                                                   8049        8049 
      6727.167
      MatchStartStep(b)                                                 8049        7957 
      5461.242
      VertexStep(OUT,vertex)                                          327370      327370 
      6782.024
      MatchEndStep(c)                                                 327370      327370 
      6238.268
      MatchStartStep(c)                                               327370      326983 
      1771.773
      VertexStep(OUT,vertex)                                        14465066    14465066 
     11489.228
      MatchEndStep(d)                                               14465066    14465066 
     14301.313
    SelectOneStep(d)                                                14465066    14465066 
     13752.966    10.71
    CountGlobalStep                                                        1           1 
      4363.667     3.40
                                                >TOTAL                     -          
-      128455.652        -
    ```
     **TinkerPop-1254**
    ```
    gremlin> g.V().match(__.as('a').out().as('b'), __.as('b').out().as('c'), __.as('c').out().as('d')).select('d').count().profile()
    ==>Traversal Metrics
    Step                                                               Count  Traversers 
     Time (ms)    % Dur
    =============================================================================================================
    GraphStep(vertex,[])                                                 808         808 
        32.453    19.96
    MatchStep(AND,[[MatchStartStep(a), ProfileStep,...              14465066        7510 
        89.463    55.01
      MatchStartStep(a)                                                  808         808 
        22.388
      VertexStep(OUT,vertex)                                            8049        7957 
        85.493
      MatchEndStep(b) (profiling ignored)                                                
         0.000
      MatchStartStep(b)                                                 8049         563 
         7.488
      VertexStep(OUT,vertex)                                          327370        7561 
        19.548
      MatchEndStep(c) (profiling ignored)                                                
         0.000
      MatchStartStep(c)                                               327370         452 
         4.247
      VertexStep(OUT,vertex)                                        14465066        7510 
        14.812
      MatchEndStep(d) (profiling ignored)                                                
         0.000
    SelectOneStep(d)                                                14465066         452 
        40.711    25.03
    CountGlobalStep                                                        1           1 
         0.001     0.00
                                                >TOTAL                     -          
-         162.630        -
    ```
    
    One thing that I did want to call out.  I was seeing some strange errors in the `MutablePath.retract`
method that I traced to multiple threads attempting to do a retract at the same time.  I did
not think this should happen or would be possible but I may be misunderstanding how `GraphComputer`
executes.  To fix this, I added a `synchronized` block.  I was hesitant to do this but it
fixed the problem and appears to have not slowed things down appreciably.  There may be a
better way around this though so any input would be welcome.
    
    Tests pass and I got a passing integration test last night but made a few changes since
then.  I'll follow up with new integration test results when they finish running today.
    
    VOTE: +1 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/tinkerpop TINKERPOP-1254

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/358.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #358
    
----
commit 88c5d27bd4d0e2bb898be7f4c3a86007d6422174
Author: Ted Wilmes <twilmes@gmail.com>
Date:   2016-07-08T12:37:41Z

    TINKERPOP-1254 Support dropping traverser path information when it is no longer needed
    
    This commit adds support for path retraction to increase the likelihood of bulking in
OLTP and OLAP modes. Traversal analysis is performed during the application of the PrunePathStrategy
to identify labels that may be dropped at various points in the traversal.  MatchStep also
performs runtime analysis to determine which labels it can drop in addition to the labels
identified during traversal strategy application.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Mime
View raw message