nifi-users mailing list archives

From Christopher Wilson <wilson...@gmail.com>
Subject Re: ReplaceText duplicates
Date Thu, 10 Sep 2015 16:16:39 GMT
The behavior I see is on the ExtractText -> ReplaceText path, where the
attributes secaudit.json, secaudit.json.0, and secaudit.json.1 are
concatenated into the payload (below).

What I expected was that the attribute, secaudit.json, would have replaced
the payload.  I've tried .0 and .1 as the replacement attribute and I still
see the same behavior.
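
For reference, here is a rough Python sketch of the attribute naming that
Bryan describes in the quoted message below. It is not NiFi's code: the
function name extract_text_attributes, the regex, and the shortened syslog
line are all made up for illustration, and the mapping simply follows the
description in this thread (property name -> first capture group, .0 ->
entire match, .N -> capture group N).

```python
import re


def extract_text_attributes(name, pattern, content):
    """Hypothetical helper mimicking the attribute naming described in
    this thread; it is an illustration, not NiFi's implementation."""
    match = re.search(pattern, content, re.DOTALL)
    if match is None:
        return {}
    attrs = {}
    group_count = match.re.groups
    if group_count >= 1:
        # The bare property name receives the first capture group.
        attrs[name] = match.group(1)
    # Index 0 is the entire match; 1..N are the individual capture groups.
    for index in range(group_count + 1):
        attrs[f"{name}.{index}"] = match.group(index)
    return attrs


# Made-up, shortened syslog line standing in for the real CADF event.
line = 'Aug 18 23:29:17 host keystone: {"priority": "INFO"}'
attrs = extract_text_attributes("secaudit.json", r"(\{.*\})", line)
for key, value in attrs.items():
    print(key, "=>", value)
```

With a single capture group that spans the whole match, all three
attributes end up holding the same JSON string, which is consistent with
the three identical values listed in the quoted message.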

{"priority": "INFO", "event_type": "identity.authenticate", "timestamp":
"2015-08-18 23:29:17.358460", "publisher_id": "identity.ip-10-0-0-60",
"payload": {"typeURI": "http://schemas.dmtf.org/cloud/audit/1.0/event",
"initiator": {"typeURI": "service/security/account/user", "host": {"agent":
"python-keystoneclient", "address": "10.0.0.60"}, "id":
"cbd0f5c99e774b31bc4d9988ddfb698c"}, "target": {"typeURI":
"service/security/account/user", "id":
"openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
"service/security", "id":
"openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
"eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
"outcome": "success", "id":
"openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
"8c5c8576-9850-4920-a1d5-1053e2c704d7"}{"priority": "INFO", "event_type":
"identity.authenticate", "timestamp": "2015-08-18 23:29:17.358460",
"publisher_id": "identity.ip-10-0-0-60", "payload": {"typeURI":
"http://schemas.dmtf.org/cloud/audit/1.0/event", "initiator": {"typeURI":
"service/security/account/user", "host": {"agent": "python-keystoneclient",
"address": "10.0.0.60"}, "id": "cbd0f5c99e774b31bc4d9988ddfb698c"},
"target": {"typeURI": "service/security/account/user", "id":
"openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
"service/security", "id":
"openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
"eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
"outcome": "success", "id":
"openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
"8c5c8576-9850-4920-a1d5-1053e2c704d7"}
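
One regex behavior that can produce this kind of back-to-back duplication,
and might be worth ruling out, is a replace-all with a pattern such as (.*):
the pattern matches the whole content once and then matches the empty
string at the very end, so the replacement is written twice. This is only
a guess at the cause. The ReplaceText configuration is not shown in this
thread, so both patterns below are assumptions, and the sketch uses
Python's re module (3.7 or later) rather than NiFi itself.

```python
import re

# Shortened, made-up stand-ins for the real content and attribute value.
payload = '{"event_type": "identity.authenticate"}'
content = "Aug 18 23:29:17 host keystone: " + payload


def replace_all(pattern, text, value):
    # A function replacement keeps backslashes in `value` literal.
    return re.sub(pattern, lambda match: value, text)


# Unanchored pattern: one match for the whole text, plus one empty match
# at the end of the text, so the value is emitted twice.
duplicated = replace_all(r"(.*)", content, payload)

# Anchored, dot-matches-newline pattern: exactly one match for everything.
single = replace_all(r"(?s)^.*$", content, payload)

print(duplicated == payload + payload)
print(single == payload)
```

If the pattern in use turns out to be unanchored, anchoring it so that it
can match only once would be the first thing to try.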

-Chris

On Thu, Sep 10, 2015 at 11:55 AM, Bryan Bende <bbende@gmail.com> wrote:

> Chris,
>
> I've been playing around with your template, and as far as I can tell both
> routes (ExtractText+ReplaceText vs. just ReplaceText) are producing a
> FlowFile with the same content; the difference is in the attributes.
>
> For ExtractText + ReplaceText I see this:
>
> Key: 'secaudit.json'
> Value: '{"priority": "INFO", "event_type": "identity.authenticate",
> "timestamp": "2015-08-18 23:29:17.358460", "publisher_id":
> "identity.ip-10-0-0-60", "payload": {"typeURI":
> "http://schemas.dmtf.org/cloud/audit/1.0/event", "initiator": {"typeURI":
> "service/security/account/user", "host": {"agent": "python-keystoneclient",
> "address": "10.0.0.60"}, "id": "cbd0f5c99e774b31bc4d9988ddfb698c"},
> "target": {"typeURI": "service/security/account/user", "id":
> "openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
> "service/security", "id":
> "openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
> "eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
> "outcome": "success", "id":
> "openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
> "8c5c8576-9850-4920-a1d5-1053e2c704d7"}'
> Key: 'secaudit.json.0'
> Value: '{"priority": "INFO", "event_type": "identity.authenticate",
> "timestamp": "2015-08-18 23:29:17.358460", "publisher_id":
> "identity.ip-10-0-0-60", "payload": {"typeURI":
> "http://schemas.dmtf.org/cloud/audit/1.0/event", "initiator": {"typeURI":
> "service/security/account/user", "host": {"agent": "python-keystoneclient",
> "address": "10.0.0.60"}, "id": "cbd0f5c99e774b31bc4d9988ddfb698c"},
> "target": {"typeURI": "service/security/account/user", "id":
> "openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
> "service/security", "id":
> "openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
> "eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
> "outcome": "success", "id":
> "openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
> "8c5c8576-9850-4920-a1d5-1053e2c704d7"}'
> Key: 'secaudit.json.1'
> Value: '{"priority": "INFO", "event_type": "identity.authenticate",
> "timestamp": "2015-08-18 23:29:17.358460", "publisher_id":
> "identity.ip-10-0-0-60", "payload": {"typeURI":
> "http://schemas.dmtf.org/cloud/audit/1.0/event", "initiator": {"typeURI":
> "service/security/account/user", "host": {"agent": "python-keystoneclient",
> "address": "10.0.0.60"}, "id": "cbd0f5c99e774b31bc4d9988ddfb698c"},
> "target": {"typeURI": "service/security/account/user", "id":
> "openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
> "service/security", "id":
> "openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
> "eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
> "outcome": "success", "id":
> "openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
> "8c5c8576-9850-4920-a1d5-1053e2c704d7"}'
> --------------------------------------------------
> {"priority": "INFO", "event_type": "identity.authenticate", "timestamp":
> "2015-08-18 23:29:17.358460", "publisher_id": "identity.ip-10-0-0-60",
> "payload": {"typeURI": "http://schemas.dmtf.org/cloud/audit/1.0/event",
> "initiator": {"typeURI": "service/security/account/user", "host": {"agent":
> "python-keystoneclient", "address": "10.0.0.60"}, "id":
> "cbd0f5c99e774b31bc4d9988ddfb698c"}, "target": {"typeURI":
> "service/security/account/user", "id":
> "openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
> "service/security", "id":
> "openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
> "eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
> "outcome": "success", "id":
> "openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
> "8c5c8576-9850-4920-a1d5-1053e2c704d7"}
>
>
> The content/payload is the part below the --------------------, and the
> three attributes secaudit.json, secaudit.json.0, and secaudit.json.1 are
> the resulting attributes from ExtractText.
> The reason for those three attributes is that ExtractText puts the first
> match into an attribute named after the property you specified
> (secaudit.json), then puts the entire match into index 0 (if you had
> multiple capture groups, this would contain them all), and then puts each
> capture group after that, starting at index 1.
>
> For the ReplaceText by itself I see:
> ....
> --------------------------------------------------
> {"priority": "INFO", "event_type": "identity.authenticate", "timestamp":
> "2015-08-18 23:29:17.358460", "publisher_id": "identity.ip-10-0-0-60",
> "payload": {"typeURI": "http://schemas.dmtf.org/cloud/audit/1.0/event",
> "initiator": {"typeURI": "service/security/account/user", "host": {"agent":
> "python-keystoneclient", "address": "10.0.0.60"}, "id":
> "cbd0f5c99e774b31bc4d9988ddfb698c"}, "target": {"typeURI":
> "service/security/account/user", "id":
> "openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
> "service/security", "id":
> "openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
> "eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
> "outcome": "success", "id":
> "openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
> "8c5c8576-9850-4920-a1d5-1053e2c704d7"}
>
>
> Is this the same behavior you are seeing?
>
>
> -Bryan
>
>
> On Thu, Sep 10, 2015 at 11:22 AM, Matt Gilman <matt.c.gilman@gmail.com>
> wrote:
>
>> Chris,
>>
>> Since you're dealing with JSON data, you may want to consider using
>> EvaluateJsonPath. It supports specifying XPath-like expressions to extract
>> values and store them in FlowFile attributes or content. If you're
>> extracting into attributes, you can evaluate multiple paths. However, if
>> you're extracting into FlowFile content, you can only specify a single path.
>>
>> I'll take a look at your template to see what's going on.
>>
>> Matt
>>
>> On Thu, Sep 10, 2015 at 11:00 AM, Christopher Wilson <wilsoncj1@gmail.com
>> > wrote:
>>
>>> I've run into an issue with ReplaceText on another thread but thought
>>> I'd move this over to its own thread.
>>>
>>> What I have is a syslog entry from OpenStack that contains CADF (Cloud
>>> Audit Data Federation) JSON as the payload.  In the context of OpenStack
>>> these are login/security events that we'd like to see outside of a normal
>>> syslog stream and passed directly over to the security team.  I'd started
>>> down the path of ExtractText and pulling out the associated JSON into an
>>> attribute but found when I wired in a ReplaceText and tried to replace the
>>> content with the attribute 3 copies of the JSON data were written to the
>>> file content.
>>>
>>> What I've since learned is I can just replace the text in place without
>>> yanking into an attribute.  However, I can see cases where I might want to
>>> replace/append text using one or more attributes.  I wanted to see if
>>> others have handled this differently and if there is an enhancement
>>> request in the offing.
>>>
>>> I put the template I was working from, with a line of the syslog data,
>>> up on GitHub in case anyone wants to see this behavior in action.  You just
>>> have to play with turning processors on/off when viewing the full bulletin
>>> board.
>>>
>>> https://github.com/cj-wilson/NiFi-Templates
>>>
>>> Thanks in advance.
>>>
>>> -Chris
>>>
>>
>>
>
