nifi-users mailing list archives

From Bryan Bende <bbe...@gmail.com>
Subject Re: ReplaceText duplicates
Date Thu, 10 Sep 2015 15:55:14 GMT
Chris,

I've been playing around with your template, and as far as I can tell both
routes (ExtractText+ReplaceText vs. just ReplaceText) produce a FlowFile
with the same content. The difference is in the attributes.

For ExtractText + ReplaceText I see this:

Key: 'secaudit.json'
Value: '{"priority": "INFO", "event_type": "identity.authenticate",
"timestamp": "2015-08-18 23:29:17.358460", "publisher_id":
"identity.ip-10-0-0-60", "payload": {"typeURI": "
http://schemas.dmtf.org/cloud/audit/1.0/event", "initiator": {"typeURI":
"service/security/account/user", "host": {"agent": "python-keystoneclient",
"address": "10.0.0.60"}, "id": "cbd0f5c99e774b31bc4d9988ddfb698c"},
"target": {"typeURI": "service/security/account/user", "id":
"openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
"service/security", "id":
"openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
"eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
"outcome": "success", "id":
"openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
"8c5c8576-9850-4920-a1d5-1053e2c704d7"}'
Key: 'secaudit.json.0'
Value: '{"priority": "INFO", "event_type": "identity.authenticate",
"timestamp": "2015-08-18 23:29:17.358460", "publisher_id":
"identity.ip-10-0-0-60", "payload": {"typeURI": "
http://schemas.dmtf.org/cloud/audit/1.0/event", "initiator": {"typeURI":
"service/security/account/user", "host": {"agent": "python-keystoneclient",
"address": "10.0.0.60"}, "id": "cbd0f5c99e774b31bc4d9988ddfb698c"},
"target": {"typeURI": "service/security/account/user", "id":
"openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
"service/security", "id":
"openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
"eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
"outcome": "success", "id":
"openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
"8c5c8576-9850-4920-a1d5-1053e2c704d7"}'
Key: 'secaudit.json.1'
Value: '{"priority": "INFO", "event_type": "identity.authenticate",
"timestamp": "2015-08-18 23:29:17.358460", "publisher_id":
"identity.ip-10-0-0-60", "payload": {"typeURI": "
http://schemas.dmtf.org/cloud/audit/1.0/event", "initiator": {"typeURI":
"service/security/account/user", "host": {"agent": "python-keystoneclient",
"address": "10.0.0.60"}, "id": "cbd0f5c99e774b31bc4d9988ddfb698c"},
"target": {"typeURI": "service/security/account/user", "id":
"openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
"service/security", "id":
"openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
"eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
"outcome": "success", "id":
"openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
"8c5c8576-9850-4920-a1d5-1053e2c704d7"}'
--------------------------------------------------
{"priority": "INFO", "event_type": "identity.authenticate", "timestamp":
"2015-08-18 23:29:17.358460", "publisher_id": "identity.ip-10-0-0-60",
"payload": {"typeURI": "http://schemas.dmtf.org/cloud/audit/1.0/event",
"initiator": {"typeURI": "service/security/account/user", "host": {"agent":
"python-keystoneclient", "address": "10.0.0.60"}, "id":
"cbd0f5c99e774b31bc4d9988ddfb698c"}, "target": {"typeURI":
"service/security/account/user", "id":
"openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
"service/security", "id":
"openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
"eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
"outcome": "success", "id":
"openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
"8c5c8576-9850-4920-a1d5-1053e2c704d7"}


The content/payload is the part below the --------------------, and the
three attributes secaudit.json, secaudit.json.0, and secaudit.json.1 are
the attributes produced by ExtractText.
There are three because ExtractText puts the first capture group into an
attribute with the name of the property you specified (secaudit.json),
then puts the entire match into index 0 (with multiple capture groups, this
would contain all of them), and then puts each capture group into its own
index after that, starting with 1.
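
As a rough illustration of that naming scheme, here is a minimal Python
sketch. It is not NiFi's code: the function name extract_text_attributes is
made up for this example, it uses Python's re module rather than Java regex,
and it assumes the pattern has at least one capture group.

```python
import re


def extract_text_attributes(name, pattern, content):
    """Hypothetical helper that mimics the attribute naming described above."""
    match = re.search(pattern, content, re.DOTALL)
    if match is None:
        return {}
    if match.re.groups < 1:
        raise ValueError("this sketch assumes at least one capture group")

    attrs = {}
    # Bare property name: the first capture group.
    attrs[name] = match.group(1)
    # Index 0: the entire match.
    attrs[f"{name}.0"] = match.group(0)
    # Index 1..n: each capture group in order.
    for i in range(1, match.re.groups + 1):
        attrs[f"{name}.{i}"] = match.group(i)
    return attrs


# Made-up syslog-style line, much shorter than the real event.
line = 'Aug 18 host keystone: {"action": "authenticate"}'

# The whole pattern is a single capture group, so all three values match.
same = extract_text_attributes("secaudit.json", r"(\{.*\})", line)

# Text outside the group shows up only in the ".0" (entire match) attribute.
differ = extract_text_attributes("secaudit.json", r"keystone: (\{.*\})", line)
```

When the whole expression is one capture group, as in the first call, the
three attributes hold identical values, which is what the dump above shows.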

For the ReplaceText by itself I see:
....
--------------------------------------------------
{"priority": "INFO", "event_type": "identity.authenticate", "timestamp":
"2015-08-18 23:29:17.358460", "publisher_id": "identity.ip-10-0-0-60",
"payload": {"typeURI": "http://schemas.dmtf.org/cloud/audit/1.0/event",
"initiator": {"typeURI": "service/security/account/user", "host": {"agent":
"python-keystoneclient", "address": "10.0.0.60"}, "id":
"cbd0f5c99e774b31bc4d9988ddfb698c"}, "target": {"typeURI":
"service/security/account/user", "id":
"openstack:036bdbcd-39ce-4545-956d-2a1a2c88dd6b"}, "observer": {"typeURI":
"service/security", "id":
"openstack:7c1bef2a-c90d-4f15-aa12-ec14bb990c7b"}, "eventType": "activity",
"eventTime": "2015-08-18T23:29:17.358172+0000", "action": "authenticate",
"outcome": "success", "id":
"openstack:305e6c25-93ee-4897-ab87-20092d14db95"}, "message_id":
"8c5c8576-9850-4920-a1d5-1053e2c704d7"}


Is this the same behavior you are seeing?


-Bryan


On Thu, Sep 10, 2015 at 11:22 AM, Matt Gilman <matt.c.gilman@gmail.com>
wrote:

> Chris,
>
> Since you're dealing with JSON data, you may want to consider using
> EvaluateJsonPath. It supports specifying XPath-like expressions to extract
> values and store them in FlowFile attributes or content. If you're extracting
> into attributes, you can evaluate multiple paths. However, if you're
> extracting into FlowFile content, you can only specify a single path.
>
> I'll take a look at your template to see what's going on.
>
> Matt
>
> On Thu, Sep 10, 2015 at 11:00 AM, Christopher Wilson <wilsoncj1@gmail.com>
> wrote:
>
>> I've run into an issue with ReplaceText on another thread but thought I'd
>> move this over to its own.
>>
>> What I have is a syslog entry from OpenStack that contains CADF (Cloud
>> Audit Data Federation) JSON as the payload.  In the context of OpenStack
>> these are login/security events that we'd like to see outside of a normal
>> syslog stream and passed directly over to the security team.  I'd started
>> down the path of ExtractText and pulling out the associated JSON into an
>> attribute but found that when I wired in a ReplaceText and tried to replace
>> the content with the attribute, 3 copies of the JSON data were written to
>> the file content.
>>
>> What I've since learned is I can just replace the text in place without
>> yanking into an attribute.  However, I can see cases where I might want to
>> replace/append text using one or more attributes.  Wanted to see if others
>> have handled this differently and if there is an enhancement request in the
>> offing?
>>
>> I put the template I was working from, with a line of the syslog data, up
>> on GitHub in case anyone wants to see this behavior in action.  You just
>> have to play with turning processors on/off when viewing the full bulletin
>> board.
>>
>> https://github.com/cj-wilson/NiFi-Templates
>>
>> Thanks in advance.
>>
>> -Chris
>>
>
>
