metron-dev mailing list archives

From cestella <...@git.apache.org>
Subject [GitHub] incubator-metron issue #435: METRON-684: Decouple Timestamp calculation from...
Date Thu, 02 Feb 2017 20:25:08 GMT
Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/435
  
    Testing Instructions beyond the normal smoke test (i.e. letting data
    flow through to the indices and checking them).
    
    ## Preliminaries
    * Set an environment variable to indicate `METRON_HOME`:
    `export METRON_HOME=/usr/metron/0.3.0` 
    
    * Create the profiler HBase table:
    `echo "create 'profiler', 'P'" | hbase shell`
    
    * Open `~/rand_gen.py` and paste the following:
    ```
    #!/usr/bin/python
    import random
    import sys
    import time
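    def main():
      # Usage: rand_gen.py <mean> <stddev> <seconds between messages>
      mu = float(sys.argv[1])
      sigma = float(sys.argv[2])
      freq_s = int(sys.argv[3])
      while True:
        # Emit one JSON map per line with a Gaussian-distributed `value`
        out = '{ "value" : ' + str(random.gauss(mu, sigma)) + ' }'
        print(out)  # parenthesized so it runs under Python 2 and 3
        sys.stdout.flush()
        time.sleep(freq_s)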
    
    if __name__ == '__main__':
      main()
    ```
    This will generate random JSON maps with a numeric field called `value`, one every `freq_s` seconds.
    
    * Set the profiler to use 1-minute tick durations:
      * Edit `$METRON_HOME/config/profiler.properties` to adjust the capture duration by changing `profiler.period.duration=15` to `profiler.period.duration=1`
      * Edit `$METRON_HOME/config/zookeeper/global.json` and add the following properties:
    ```
    "profiler.client.period.duration" : "1",
    "profiler.client.period.duration.units" : "MINUTES"
    ```
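
Adding the two properties by hand means getting the commas right in an existing JSON map. A minimal sketch of the same merge in Python, assuming the file is a flat JSON object; `with_client_settings` is a hypothetical helper and the sample content is illustrative only (your `global.json` will differ):

```python
import json

# The two client-side settings named in the step above.
CLIENT_SETTINGS = {
    "profiler.client.period.duration": "1",
    "profiler.client.period.duration.units": "MINUTES",
}


def with_client_settings(global_json_text):
    """Return the global config text with the profiler client settings merged in."""
    config = json.loads(global_json_text)
    config.update(CLIENT_SETTINGS)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical existing content; the real file on your node will differ.
    existing = '{"es.clustername": "metron", "es.ip": "node1"}'
    print(with_client_settings(existing))
```

Existing keys are preserved; only the two profiler client keys are added or overwritten.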
    
    ## Free Up Space on the virtual machine
    
    First, let's free up some headroom on the virtual machine. If you are running this on a
    multinode cluster, you do not have to do this.
    * Kill monit via `service monit stop`
    * Kill tcpreplay via `for i in $(ps -ef | grep tcpreplay | awk '{print $2}');do kill -9 $i;done`
    * Kill existing parser topologies via 
       * `storm kill snort`
       * `storm kill bro`
    * We won't need the enrichment or indexing topologies for this test, so you can kill them via:
       * `storm kill enrichment`
       * `storm kill indexing`
    * Kill yaf via `for i in $(ps -ef | grep yaf | awk '{print $2}');do kill -9 $i;done`
    * Kill bro via `for i in $(ps -ef | grep bro | awk '{print $2}');do kill -9 $i;done`
    
    ## Start the profiler
    * `$METRON_HOME/bin/start_profiler_topology.sh`
    
    ## Test Case
    
    * Set up a profile to accept some synthetic data with a numeric `value` field and persist a stats summary of the data
      * Edit `$METRON_HOME/config/zookeeper/profiler.json` and paste in the following:
    ```
    {
      "profiles": [
        {
          "profile": "stat",
          "foreach": "'global'",
          "onlyif": "true",
          "init": {},
          "update": {
            "s": "STATS_ADD(s, value)"
          },
          "result": "s"
        }
      ]
    }
    ```
    
    * Send some synthetic data directly to the profiler:
    `python ~/rand_gen.py 0 1 1 | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic indexing`
    * Wait for at least 10 minutes and execute the following via the Stellar REPL:
    ```
    # Grab the last 10 minutes worth of timestamps
    PROFILE_FIXED( 10, 'MINUTES')
    # Looks like 10 were returned, great.  Now, validate that I get 10 profile measurements back
    PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )
    # Ok, now look at the mean across the distribution
    STATS_MEAN( STATS_MERGE(PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )))
    ```
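
To make the profile and the last command concrete, here is a rough Python model of what they ask for: one summary per entity and period, then a merge across periods before taking the mean. This is a sketch of the semantics only, not how Metron implements them. `RunningStats` and `run_profile` are hypothetical names, the period numbering is an assumption, and the toy summary tracks just count and sum, whereas the real statistics object carries more.

```python
from collections import defaultdict


class RunningStats:
    """Toy stand-in for a stats summary: tracks only count and sum."""

    def __init__(self, count=0, total=0.0):
        self.count = count
        self.total = total

    def add(self, value):
        """Fold one observation into the summary (the role STATS_ADD plays)."""
        self.count += 1
        self.total += value
        return self

    @staticmethod
    def merge(summaries):
        """Combine several summaries into one (the role STATS_MERGE plays)."""
        merged = RunningStats()
        for s in summaries:
            merged.count += s.count
            merged.total += s.total
        return merged

    def mean(self):
        return self.total / self.count


def run_profile(messages, period_ms=60000):
    """Keep one summary per (entity, period) for (epoch_millis, dict) pairs."""
    summaries = defaultdict(RunningStats)
    for epoch_ms, message in messages:
        entity = "global"               # "foreach": every message maps to one entity
        period = epoch_ms // period_ms  # assumed period numbering
        summaries[(entity, period)].add(message["value"])  # "update"
    return dict(summaries)


if __name__ == "__main__":
    # Hypothetical messages: two in the first minute, one in the second.
    msgs = [(0, {"value": 1.0}), (30000, {"value": 3.0}), (60000, {"value": 6.0})]
    per_period = run_profile(msgs)
    merged = RunningStats.merge(per_period.values())
    # Mean over all raw values, not the average of the per-period means.
    print(merged.count, merged.mean())
```

Merging before taking the mean weights each period by how many messages it saw, which is why the test calls `STATS_MERGE` first and `STATS_MEAN` second.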
    For me, the following was the result:
    ```
    Stellar, Go!
    Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
    {es.clustername=metron, es.ip=node1, es.port=9300, es.date.format=yyyy.MM.dd.HH, profiler.client.period.duration=1, profiler.client.period.duration.units=MINUTES}
    [Stellar]>>> # Grab the last 10 minutes worth of timestamps
    [Stellar]>>> PROFILE_FIXED( 10, 'MINUTES')
    Functions loaded, you may refer to functions now...
    [24767772, 24767773, 24767774, 24767775, 24767776, 24767777, 24767778, 24767779, 24767780, 24767781, 24767782]
    [Stellar]>>> # Looks like 10 were returned, great.  Now, validate that I get 10 profile measurements back
    [Stellar]>>> PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )
    [org.apache.metron.statistics.OnlineStatisticsProvider@44749031, org.apache.metron.statistics.OnlineStatisticsProvider@d2a7fbb9, org.apache.metron.statistics.OnlineStatisticsProvider@a217cfd7, org.apache.metron.statistics.OnlineStatisticsProvider@c5e42aed, org.apache.metron.statistics.OnlineStatisticsProvider@c4f4753d, org.apache.metron.statistics.OnlineStatisticsProvider@87a1606a, org.apache.metron.statistics.OnlineStatisticsProvider@e1b4c8dc, org.apache.metron.statistics.OnlineStatisticsProvider@fdb7b8d8]
    [Stellar]>>> # Ok, now look at the mean across the distribution
    [Stellar]>>> STATS_MEAN( STATS_MERGE(PROFILE_GET('stat', 'global', PROFILE_FIXED( 10, 'MINUTES' ) )))
    -0.0077433441069769265
    [Stellar]>>>
    ```
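
The numbers returned by `PROFILE_FIXED` in the transcript look like whole one-minute periods counted from the Unix epoch. The sketch below works through that reading; the numbering scheme is an assumption inferred from the values above, and `period_id`, `fixed_lookback`, and `period_start` are hypothetical helpers, not Metron functions.

```python
from datetime import datetime, timezone

PERIOD_MS = 60 * 1000  # one-minute periods, matching profiler.client.period.duration=1


def period_id(epoch_ms, period_ms=PERIOD_MS):
    """Assumed numbering: whole periods elapsed since the Unix epoch."""
    return epoch_ms // period_ms


def fixed_lookback(now_ms, lookback_ms, period_ms=PERIOD_MS):
    """Period ids touched by the window [now - lookback, now], inclusive."""
    first = period_id(now_ms - lookback_ms, period_ms)
    last = period_id(now_ms, period_ms)
    return list(range(first, last + 1))


def period_start(pid, period_ms=PERIOD_MS):
    """UTC start time of a period id."""
    return datetime.fromtimestamp(pid * period_ms / 1000, tz=timezone.utc)


if __name__ == "__main__":
    # First and last ids from the transcript above.
    print(period_start(24767772))
    print(period_start(24767782))
    # Hypothetical "now": 30 seconds into the last period.
    now_ms = 24767782 * PERIOD_MS + 30 * 1000
    print(len(fixed_lookback(now_ms, 10 * PERIOD_MS)))
```

Under this reading the ids fall on 2 Feb 2017, shortly before the message was sent, and a 10-minute window that starts and ends partway through a minute touches 11 periods, which matches the 11 ids in the transcript.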



