metron-dev mailing list archives

From cestella <...@git.apache.org>
Subject [GitHub] incubator-metron issue #419: METRON-664: Make the index configuration per-wr...
Date Wed, 18 Jan 2017 17:05:16 GMT
Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/419
  
    Testing Instructions beyond the normal smoke test (i.e. letting data
    flow through to the indices and checking them).
    
    ## Preliminaries
    
    Since I will use the squid topology to pass data through in a controlled
    way, we must install squid and generate one data point:
    * `yum install -y squid`
    * `service squid start`
    * `squidclient http://www.yahoo.com`
    
    Also, set an environment variable to indicate `METRON_HOME`:
    * `export METRON_HOME=/usr/metron/0.3.0` 
    
    ## Free Up Space on the virtual machine
    
    First, let's free up some headroom on the virtual machine. If you are running this on a multinode cluster, you would not have to do this.
    * Kill monit via `service monit stop`
    * Kill tcpreplay via `for i in $(ps -ef | grep tcpreplay | awk '{print $2}');do kill -9 $i;done`
    * Kill existing parser topologies via 
       * `storm kill snort`
       * `storm kill bro`
    * Kill flume via `for i in $(ps -ef | grep flume | awk '{print $2}');do kill -9 $i;done`
    * Kill yaf via `for i in $(ps -ef | grep yaf | awk '{print $2}');do kill -9 $i;done`
    * Kill bro via `for i in $(ps -ef | grep bro | awk '{print $2}');do kill -9 $i;done`
    
    ## Deploy the squid parser
    * Create the squid kafka topic: `/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper node1:2181 --create --topic squid --partitions 1 --replication-factor 1`
    * Start via `$METRON_HOME/bin/start_parser_topology.sh -k node1:6667 -z node1:2181 -s squid`
    
    ### Test Case 0: Base Case Test
    * Delete any existing squid index via `curl -XDELETE "http://localhost:9200/squid*"`
    * Send 1 data point through and ensure that it lands in the index:
      * `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `1`
    * Validate that the Storm UI for the indexing topology shows a warning in the console for both the "hdfsIndexingBolt" and "indexingBolt", to the effect of `java.lang.Exception: WARNING: Default and (likely) unoptimized writer config used for hdfs writer and sensor squid` and `java.lang.Exception: WARNING: Default and (likely) unoptimized writer config used for elasticsearch writer and sensor squid` respectively
    
    ### Test Case 1: Adjusting batch sizes independently
    * Delete any existing squid index via `curl -XDELETE "http://localhost:9200/squid*"`
    * Create a file at `$METRON_HOME/config/zookeeper/indexing/squid.json` with the following contents:
    ```
    {
      "hdfs" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : true
      },
      "elasticsearch" : {
        "index": "squid",
        "batchSize": 5,
        "enabled" : true
      }
    }
    ```
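
    The two writers above can now be tuned independently. As a quick sanity check before pushing, the per-writer shape of this file can be validated with a short script (a hypothetical helper written for illustration, not part of Metron):

```python
import json

# Writers exercised in this walkthrough; anything else would be flagged.
KNOWN_WRITERS = {"hdfs", "elasticsearch"}

def validate_indexing_config(text):
    """Parse a per-writer indexing config and sanity-check each writer entry."""
    config = json.loads(text)
    for writer, settings in config.items():
        assert writer in KNOWN_WRITERS, "unknown writer: %s" % writer
        assert isinstance(settings.get("index"), str), "index must be a string"
        assert isinstance(settings.get("batchSize"), int) and settings["batchSize"] >= 1
        assert isinstance(settings.get("enabled"), bool), "enabled must be a boolean"
    return config

squid_json = """
{
  "hdfs" : { "index": "squid", "batchSize": 1, "enabled" : true },
  "elasticsearch" : { "index": "squid", "batchSize": 5, "enabled" : true }
}
"""
config = validate_indexing_config(squid_json)
print(config["elasticsearch"]["batchSize"])  # 5
```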
    * Push the configs via `$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z node1:2181`
    * Send 4 data points through and ensure:
      * `cat /var/log/squid/access.log /var/log/squid/access.log /var/log/squid/access.log /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `0`
      * `hadoop fs -cat /apps/metron/indexing/indexed/squid/enrichment-null* | wc -l` should yield `4`
    * Send a final data point through and ensure:
      * `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `5`
      * `hadoop fs -cat /apps/metron/indexing/indexed/squid/enrichment-null* | wc -l` should yield `5`
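
    The counts above follow from the batching semantics: each writer buffers messages independently and flushes only when its own `batchSize` is reached, which is why elasticsearch shows nothing until the fifth point arrives while hdfs (batch size 1) writes every point immediately. An illustrative sketch of that behavior (not Metron's actual writer code):

```python
class BatchingWriter:
    """Buffers messages and flushes them only when the batch fills up."""
    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []     # messages waiting for a flush
        self.written = []    # stands in for the index / HDFS files

    def write(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.written.extend(self.buffer)
            self.buffer = []

hdfs = BatchingWriter(batch_size=1)
es = BatchingWriter(batch_size=5)

# Send 4 data points, as in the test case above.
for point in range(4):
    hdfs.write(point)
    es.write(point)
print(len(hdfs.written), len(es.written))  # 4 0

# The fifth point fills the elasticsearch batch and triggers its flush.
hdfs.write(4)
es.write(4)
print(len(hdfs.written), len(es.written))  # 5 5
```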
     
    ### Test Case 2: Turn off HDFS writer
    * Delete any existing squid index via `curl -XDELETE "http://localhost:9200/squid*"`
    * Edit the file at `$METRON_HOME/config/zookeeper/indexing/squid.json` to have the following contents:
    ```
    {
      "hdfs" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : false 
      },
      "elasticsearch" : {
        "index": "squid",
        "batchSize": 1,
        "enabled" : true
      }
    }
    ```
    * Push the configs via `$METRON_HOME/bin/zk_load_configs.sh -m PUSH -i $METRON_HOME/config/zookeeper -z node1:2181`
    * Send 1 data point through and ensure:
      * `cat /var/log/squid/access.log | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list node1:6667 --topic squid`
      * `curl "http://localhost:9200/squid*/_search?pretty=true&q=*:*" 2> /dev/null | grep "full_hostname" | wc -l` should yield `1`
      * `hadoop fs -cat /apps/metron/indexing/indexed/squid/enrichment-null* | wc -l` should yield `0`
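
    The `enabled` flag simply removes a writer from consideration for that sensor, which is why the elasticsearch count advances while the HDFS count stays at `0`. Roughly (an illustrative sketch, not the actual topology code):

```python
def enabled_writers(config):
    """Return the names of the writers that should receive messages."""
    return sorted(name for name, settings in config.items()
                  if settings.get("enabled", True))

# The Test Case 2 configuration: hdfs off, elasticsearch on.
squid_config = {
    "hdfs": {"index": "squid", "batchSize": 1, "enabled": False},
    "elasticsearch": {"index": "squid", "batchSize": 1, "enabled": True},
}
print(enabled_writers(squid_config))  # ['elasticsearch']
```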
    
    ### Test Case 3: Stellar Management Functions
    * Execute the following in the Stellar shell:
    ```
    Stellar, Go!
    Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
    {es.clustername=metron, es.ip=node1, es.port=9300, es.date.format=yyyy.MM.dd.HH}
    [Stellar]>>> # Grab the indexing config
    [Stellar]>>> squid_config := CONFIG_GET('INDEXING', 'squid', true)
    [Stellar]>>>
    [Stellar]>>> # Update the index and batch size
    [Stellar]>>> squid_config := INDEXING_SET_BATCH( INDEXING_SET_INDEX(squid_config, 'hdfs', 'squid'), 'hdfs', 2)
    [Stellar]>>> # Push the config to zookeeper
    [Stellar]>>> CONFIG_PUT('INDEXING', squid_config, 'squid')
    [Stellar]>>> # Grab the updated config from zookeeper
    [Stellar]>>> CONFIG_GET('INDEXING', 'squid')
    {
      "hdfs" : {
        "index" : "squid",
        "batchSize" : 2,
        "enabled" : false
      },
      "elasticsearch" : {
        "index" : "squid",
        "batchSize" : 1,
        "enabled" : true
      }
    }
    ```
    * Confirm that the dump from `$METRON_HOME/bin/zk_load_configs.sh -m DUMP -z node1:2181` contains the updated squid indexing config with an hdfs batch size of `2`
    * Now pull the configs locally via `$METRON_HOME/bin/zk_load_configs.sh -m PULL -z node1:2181 -o $METRON_HOME/config/zookeeper -f`
    * Check that the "hdfs" config at `$METRON_HOME/config/zookeeper/indexing/squid.json` is indeed:
    ```
    {
      "index" : "squid",
      "batchSize" : 2,
      "enabled" : false
    }
    ```
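
    The Stellar helpers used above are just updates to nested keys in the indexing config map; their effect can be mimicked in Python (hypothetical equivalents, shown only to clarify the semantics):

```python
import copy

def indexing_set_index(config, writer, index):
    """Mimic INDEXING_SET_INDEX: set the index name for one writer."""
    updated = copy.deepcopy(config)
    updated.setdefault(writer, {})["index"] = index
    return updated

def indexing_set_batch(config, writer, batch_size):
    """Mimic INDEXING_SET_BATCH: set the batch size for one writer."""
    updated = copy.deepcopy(config)
    updated.setdefault(writer, {})["batchSize"] = batch_size
    return updated

# Start from the Test Case 2 config and apply the same two updates as above.
squid_config = {
    "hdfs": {"index": "squid", "batchSize": 1, "enabled": False},
    "elasticsearch": {"index": "squid", "batchSize": 1, "enabled": True},
}
squid_config = indexing_set_batch(
    indexing_set_index(squid_config, "hdfs", "squid"), "hdfs", 2)
print(squid_config["hdfs"]["batchSize"])  # 2
```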



