metron-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [metron] nickwallen opened a new pull request #1556: METRON-2284 Metron Profiler for Spark doesn't work as expected
Date Fri, 08 Nov 2019 20:05:06 GMT
nickwallen opened a new pull request #1556: METRON-2284 Metron Profiler for Spark doesn't work
as expected
URL: https://github.com/apache/metron/pull/1556
 
 
   ### The Problem 
   
   Some profile "update" expressions execute incorrectly in the Batch Profiler. The bug report
provides an example in which a call to `IS_EMPTY` returns true, rather than the expected false,
when the profile is executed by the Batch Profiler. The same profile, executed in the REPL
or in the Streaming Profiler, returns the expected result.
   
   ### Root Cause
   
   The values contained within a telemetry message are exposed to a profile's update expression
at runtime.  This allows the profile to refer to message fields by name.  
   
   After message routing occurs, the telemetry message is corrupted during serialization.
It appears that all type information is lost when the corruption occurs. As a result, variable
resolution for message fields fails when a profile's 'update' expression is executed.
 
   * This does not impact variables defined within the profile itself. 
   * This does not impact the 'onlyif', 'foreach', 'init', and 'result' expressions
of a profile; only the 'update' expression.
   
   The corruption occurs when a `MessageRoute` is serialized. This only affects the `JSONObject`
representing the telemetry message, rather than any other fields like the profile definition,
entity, or timestamp data also contained within a `MessageRoute`. 
    
   The `MapVariableResolver` is passed the corrupted `JSONObject` so that variables can be
resolved from the fields contained within the message. Because of the corruption, variables
that refer to message fields are not resolved.
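
   The failure mode described above can be illustrated with a minimal, self-contained sketch.
It is not Metron code: `ResolverSketch`, `resolve`, `isEmpty`, and the field name `ip_src_addr`
are hypothetical stand-ins for a map-backed variable resolver and an emptiness check. It also
assumes the corruption shows up as message fields that can no longer be looked up; the report
above only says that type information appears to be lost.

```java
import java.util.HashMap;
import java.util.Map;

public class ResolverSketch {

    // Hypothetical map-backed resolver: looks a variable up by field name.
    // Returns null when the message does not contain the field.
    public static Object resolve(Map<String, Object> message, String variable) {
        return message.get(variable);
    }

    // Hypothetical emptiness check: an unresolved (null) value or an
    // empty string counts as empty; anything else does not.
    public static boolean isEmpty(Object value) {
        if (value == null) {
            return true;
        }
        if (value instanceof String) {
            return ((String) value).isEmpty();
        }
        return false;
    }

    public static void main(String[] args) {
        // An intact message: the field resolves to a non-empty value.
        Map<String, Object> intact = new HashMap<>();
        intact.put("ip_src_addr", "10.0.0.1");

        // Assumption: the corrupted message is modeled as a map in which
        // the field can no longer be found.
        Map<String, Object> corrupted = new HashMap<>();

        System.out.println(isEmpty(resolve(intact, "ip_src_addr")));    // prints false
        System.out.println(isEmpty(resolve(corrupted, "ip_src_addr"))); // prints true
    }
}
```

   Under these assumptions, the same expression yields false for the intact message and true
for the corrupted one, which matches the symptom reported for `IS_EMPTY`.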
   
   The corruption is caused by the use of Spark's bean encoder when serializing `MessageRoute`
objects. 
   
   ### Changes
   
   * Rather than using the bean encoder, I opted to use the Kryo encoder, which correctly
serializes the objects.
   * Added test cases for the defect.
   
   ## Pull Request Checklist
   
   - [ ] Is there a JIRA ticket associated with this PR? If not, one needs to be created at
[Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
   - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying
to resolve? Pay particular attention to the hyphen "-" character.
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically
master)?
   - [ ] Have you included steps to reproduce the behavior or problem that is being changed
or addressed?
   - [ ] Have you included steps or a guide to how the change may be verified and tested manually?
   - [ ] Have you ensured that the full suite of tests and checks has been executed in the
root metron folder via:
   - [ ] Have you written or updated unit tests and/or integration tests to verify your changes?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way
that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] Have you verified the basic functionality of the build by building and running locally
with Vagrant full-dev environment or the equivalent?
   
