metron-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (METRON-712) Separate evaluation from parsing in Stellar
Date Mon, 06 Mar 2017 14:29:32 GMT

    [ https://issues.apache.org/jira/browse/METRON-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15897394#comment-15897394 ]

ASF GitHub Bot commented on METRON-712:
---------------------------------------

GitHub user cestella opened a pull request:

    https://github.com/apache/incubator-metron/pull/473

    METRON-712: Separate evaluation from parsing in Stellar

    # Description
    With the current implementation of Stellar, we cannot cache the parse tree and then apply
it after the fact. It's just an artifact of how we do the parsing: we actually execute the
statement as we parse rather than constructing an AST that can then be evaluated later given
a message. Essentially what I'm proposing is that we build the equivalent of Pattern.compile()
in Java except for Stellar.
    We should do this for multiple reasons:
    * code clarity - decoupling the Stellar language from the generated parser code
    * performance - saving lexing and parsing for every message.  Also, the resulting parse-stack
may be much smaller than the somewhat complex.
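
    A minimal sketch of the compile-once, evaluate-many idea described above, in the spirit of
    `Pattern.compile()`. It is illustrative only: the class `CompileOnceSketch`, the `Compiled`
    interface, and the toy `FUNC(arg)` grammar are hypothetical and are not the classes or the
    grammar used by Stellar or by this PR.

    ```java
    import java.util.Locale;
    import java.util.Map;
    import java.util.function.Function;

    /** Hypothetical sketch: not Stellar's real classes or grammar. */
    public class CompileOnceSketch {

      /** A parsed expression that can be applied to many messages. */
      public interface Compiled extends Function<Map<String, Object>, Object> {}

      /** Parses FUNC(arg) once; arg is a 'quoted literal' or a variable name. */
      public static Compiled compile(String expr) {
        String s = expr.trim();
        int open = s.indexOf('(');
        if (open < 0 || !s.endsWith(")")) {
          throw new IllegalArgumentException("Expected FUNC(arg): " + expr);
        }
        String func = s.substring(0, open).trim();
        String arg = s.substring(open + 1, s.length() - 1).trim();

        // Literals are resolved now; variables are looked up at evaluation time.
        final Function<Map<String, Object>, Object> operand;
        if (arg.length() >= 2 && arg.startsWith("'") && arg.endsWith("'")) {
          String literal = arg.substring(1, arg.length() - 1);
          operand = message -> literal;
        } else {
          operand = message -> message.get(arg);
        }

        switch (func) {
          case "TO_UPPER":
            return message -> String.valueOf(operand.apply(message)).toUpperCase(Locale.ROOT);
          case "TO_LOWER":
            return message -> String.valueOf(operand.apply(message)).toLowerCase(Locale.ROOT);
          default:
            throw new IllegalArgumentException("Unknown function: " + func);
        }
      }

      public static void main(String[] args) {
        Compiled toLower = compile("TO_LOWER(name)"); // parsed once
        System.out.println(toLower.apply(Map.of("name", "Casey")));   // prints "casey"
        System.out.println(toLower.apply(Map.of("name", "STELLAR"))); // prints "stellar"
      }
    }
    ```

    In this sketch, string handling happens only inside `compile`; each later `apply` call just
    looks up the variable in the message and runs the function.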
    
    In this PR, I have added a Google Guava cache that will cache the resulting compiled expression
in `BaseStellarProcessor` for 10 minutes (by default).  I have also created a microbenchmarking
suite and have evaluated this on a few representative expressions.
    Results:
    * `TO_UPPER('casey')`
      * Median ms before: `880.5`
      * Median ms after: `15`
      * Speedup: 58.6x faster
    * `TO_LOWER(name)`
      * Median ms before: `497`
      * Median ms after: `3`
      * Speedup: 165.6x faster
    * `1 + 2*(3 + int_num) / 10.0`
      * Median ms before: `676`
      * Median ms after: `4`
      * Speedup: 169x faster
    * `1.5 + 2*(3 + double_num) / 10.0`
      * Median ms before: `634`
      * Median ms after: `1`
      * Speedup: 634x faster
    * `if ('foo' in ['foo']) OR one == very_nearly_one then 'one' else 'two'`
      * Median ms before: `616`
      * Median ms after: `23`
      * Speedup: 26x faster
    * `1 + 2*(3 + int_num) / 10.0`
      * Median ms before: `601`
      * Median ms after: `2`
      * Speedup: 300.5x faster
    * `DOMAIN_TO_TLD(domain)`
      * Median ms before: `505`
      * Median ms after: `16`
      * Speedup: 32.5x faster
    * `DOMAIN_REMOVE_SUBDOMAINS(domain)`
      * Median ms before: `496`
      * Median ms after: `11`
      * Speedup: 45x faster
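
    As a rough illustration of the caching described above, the sketch below keeps one compiled
    form per expression string and recompiles it once an entry is older than a time-to-live. It
    uses only the JDK as a stand-in for the cache used in the PR; `ExpiringCompileCache`, its
    `compilations()` counter, and the string-returning "compiler" are hypothetical names made up
    for this example.

    ```java
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.function.Function;

    /** Hypothetical JDK-only stand-in; not the cache class used in the PR. */
    public class ExpiringCompileCache<V> {

      /** A compiled value plus the time it was written. */
      private static final class Entry<T> {
        final T value;
        final long writtenAtNanos;

        Entry(T value, long writtenAtNanos) {
          this.value = value;
          this.writtenAtNanos = writtenAtNanos;
        }
      }

      private final ConcurrentHashMap<String, Entry<V>> entries = new ConcurrentHashMap<>();
      private final AtomicInteger compilations = new AtomicInteger();
      private final Function<String, V> compiler;
      private final long ttlNanos;

      public ExpiringCompileCache(Function<String, V> compiler, long ttl, TimeUnit unit) {
        this.compiler = compiler;
        this.ttlNanos = unit.toNanos(ttl);
      }

      /** Returns the cached compiled form, compiling only on a miss or after expiry. */
      public V get(String expression) {
        long now = System.nanoTime();
        Entry<V> entry = entries.compute(expression, (key, old) -> {
          if (old != null && now - old.writtenAtNanos < ttlNanos) {
            return old; // still fresh: skip lexing and parsing
          }
          compilations.incrementAndGet();
          return new Entry<>(compiler.apply(key), now);
        });
        return entry.value;
      }

      /** Number of times the compiler actually ran. */
      public int compilations() {
        return compilations.get();
      }

      public static void main(String[] args) {
        ExpiringCompileCache<String> cache =
            new ExpiringCompileCache<>(expr -> "compiled:" + expr, 10, TimeUnit.MINUTES);
        cache.get("TO_UPPER('casey')");
        cache.get("TO_UPPER('casey')"); // served from the cache
        System.out.println(cache.compilations()); // prints 1
      }
    }
    ```

    With a cache like this, the per-message cost for a repeated expression is a map lookup plus
    evaluation, which is the effect the before and after medians above are measuring.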
    
    
    # Testing Plan
    Please refer to the METRON-744 [PR](https://github.com/apache/incubator-metron/pull/468#issue-210707129)
testing plan.
    
    In order to streamline the review of the contribution, we ask that you follow these guidelines
and double check the following:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at
[Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).

    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are
trying to resolve? Pay particular attention to the hyphen "-" character.
    - [x] Has your PR been rebased against the latest commit within the target branch (typically
master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is being changed
or addressed?
    - [x] Have you included steps or a guide to how the change may be verified and tested
manually?
    - [x] Have you ensured that the full suite of tests and checks has been executed in the
root incubating-metron folder via:
    
    ```
    mvn -q clean integration-test install && build_utils/verify_licenses.sh 
    ```
    
    - [x] Have you written or updated unit tests and/or integration tests to verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies licensed in a way
that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?

    - [ ] Have you verified the basic functionality of the build by building and running locally
with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [x] Have you ensured that the format looks appropriate for the output in which it is rendered
by building and verifying the site-book? If not, run the following commands and then verify the
changes via site-book/target/site/index.html.
    
    ```
    cd site-book
    bin/generate-md.sh
    mvn site:site
    
    ```
    
    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and
submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal
repository such that your branches are built there before submitting a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cestella/incubator-metron stellar_optimize

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/473.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #473
    
----
commit 7c04584950c452b5f0dc786de8d91f3978bb92ec
Author: cstella <cestella@gmail.com>
Date:   2017-03-06T08:15:57Z

    Renaming Compiler to Interpreter.

commit 7483361f395707a86031bc5eb7027259cc75786e
Author: cstella <cestella@gmail.com>
Date:   2017-03-06T08:16:25Z

    Merge branch 'master' into stellar_optimize

commit c2e7eb08a9c001bc0eef62e5a0423cd17227ef46
Author: cstella <cestella@gmail.com>
Date:   2017-03-06T10:21:54Z

    Added cache to speed up stellar

commit 3c416e7de9de7c96f80116884f264cab92f7f9e6
Author: cstella <cestella@gmail.com>
Date:   2017-03-06T10:24:36Z

    Deleted StellarInterpreter.

commit fbad7f31e6171abf4a8e2e0031d207a2de41f5ac
Author: cstella <cestella@gmail.com>
Date:   2017-03-06T11:22:36Z

    Adding microbenchmarking suite.

commit 7b9ce9a9a3b97059a25af05901f9077392a0e58a
Author: cstella <cestella@gmail.com>
Date:   2017-03-06T13:36:02Z

    Updating tests to expect real exceptions, not wrapped exceptions.

----


> Separate evaluation from parsing in Stellar
> -------------------------------------------
>
>                 Key: METRON-712
>                 URL: https://issues.apache.org/jira/browse/METRON-712
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Casey Stella
>
> With the current implementation of Stellar, we cannot cache the parse tree and then apply
it after the fact. It's just an artifact of how we do the parsing: we actually execute the
statement as we parse rather than constructing an AST that can then be evaluated later given
a message. Essentially what I'm proposing is that we build the equivalent of Pattern.compile()
in Java except for Stellar.
> We should do this for multiple reasons:
> * code clarity - decoupling the Stellar language from the generated parser code
> * performance - saving lexing and parsing for every message



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
