metron-issues mailing list archives

From jagdeepsingh2 <...@git.apache.org>
Subject [GitHub] metron pull request #1245: METRON-1795: Initial Commit for Regular Expressio...
Date Mon, 10 Dec 2018 01:04:14 GMT
Github user jagdeepsingh2 commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1245#discussion_r240064005
  
    --- Diff: metron-platform/metron-parsers/README.md ---
    @@ -52,6 +52,62 @@ There are two general types of parsers:
            This is using the default value for `wrapEntityName` if that property is not set.
         * `wrapEntityName` : Sets the name to use when wrapping JSON using `wrapInEntityArray`. The `jsonpQuery` should reference this name.
         * A field called `timestamp` is expected to exist and, if it does not, then the current time is inserted.  
    +  * Regular Expressions Parser
    +      * `recordTypeRegex` : A regular expression to uniquely identify a record type.
    +      * `messageHeaderRegex` : A regular expression used to extract fields from the part of a message that is common across all messages.
    +      * `convertCamelCaseToUnderScore` : If this property is set to true, the parser will automatically convert all camel case property names to underscore-separated names. 
    +          For example, the following conversions will happen automatically:
    +
    +          ```
    +          ipSrcAddr -> ip_src_addr
    +          ipDstAddr -> ip_dst_addr
    +          ipSrcPort -> ip_src_port
    +          ```
    +          Note that this property may be necessary because Java does not support underscores in named group names. If your property naming convention requires underscores in property names, use this property.
    +          
    +      * `fields` : A JSON list of maps containing a record type to regular expression mapping.
    +      
    +      A complete configuration example would look like:
    +      
    +      ```json
    +      "convertCamelCaseToUnderScore": true, 
    +      "recordTypeRegex": "kernel|syslog",
    +      "messageHeaderRegex": "(?<syslogPriority>(?<=^<)\\d{1,4}(?=>)).*?(?<timestamp>(?<=>)[A-Za-z]{3}\\s{1,2}\\d{1,2}\\s\\d{1,2}:\\d{1,2}:\\d{1,2}(?=\\s)).*?(?<syslogHost>(?<=\\s).*?(?=\\s))",
    --- End diff --
    
    I have added this explanation to the README. Thanks for the suggestion.
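
    For readers of the archive, here is a minimal sketch of why `convertCamelCaseToUnderScore` can be needed. It is not the parser's actual code: the class `CamelCaseFieldDemo`, the helpers `camelToUnderscore`, `compiles` and `extract`, and the sample input are all hypothetical. It only illustrates that `java.util.regex` rejects underscores in named group names, and one plausible way to rename camel case groups afterwards.

    ```java
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import java.util.regex.PatternSyntaxException;

    // Illustrative only; names and logic are assumptions, not Metron's implementation.
    class CamelCaseFieldDemo {

        // Insert an underscore at each lower-case/digit to upper-case boundary,
        // then lower-case the whole name (ipSrcAddr -> ip_src_addr).
        static String camelToUnderscore(String name) {
            return name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase();
        }

        // Returns true if java.util.regex accepts the pattern.
        static boolean compiles(String regex) {
            try {
                Pattern.compile(regex);
                return true;
            } catch (PatternSyntaxException e) {
                return false;
            }
        }

        // Match the input and return the named groups keyed by their
        // underscore-separated names. Group names are passed in explicitly.
        static Map<String, String> extract(Pattern pattern, String input, String... groupNames) {
            Map<String, String> fields = new LinkedHashMap<>();
            Matcher matcher = pattern.matcher(input);
            if (matcher.find()) {
                for (String group : groupNames) {
                    fields.put(camelToUnderscore(group), matcher.group(group));
                }
            }
            return fields;
        }

        public static void main(String[] args) {
            // Group names may contain only ASCII letters and digits, so the
            // underscore variant fails to compile while the camel case one works.
            System.out.println(compiles("(?<ip_src_addr>\\S+)"));
            System.out.println(compiles("(?<ipSrcAddr>\\S+)"));

            Pattern pattern = Pattern.compile(
                "(?<ipSrcAddr>\\d+\\.\\d+\\.\\d+\\.\\d+):(?<ipSrcPort>\\d+)");
            System.out.println(extract(pattern, "10.0.0.1:443", "ipSrcAddr", "ipSrcPort"));
        }
    }
    ```

    The group names are listed by hand because older Java versions offer no public API to enumerate the named groups of a `Pattern`.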

