nifi-users mailing list archives

From Mark Payne
Subject Re: ValidateRecord Processor
Date Sun, 05 Nov 2017 21:06:48 GMT
Hey Paul,

So a FlowFile consists only of Attributes and a stream of bytes. In order for the ValidateRecord
Processor to validate the data, it needs to convert that data from a stream of bytes into
objects that it can work with. This is the responsibility of the Record Reader - to take a
bunch of bytes and create one or more Record objects. The processor is then responsible for
sending those Record objects on to the next processor in the flow. To do that, it has to take
those Records and convert them back into a stream of bytes. And this is the job of the Record
Writer - to take one or more Record objects and convert them into a stream of bytes (i.e.,
serialize them).

So if there were no Record Writer, then the processor would not be able to convert the
Record objects into streams of bytes.
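
To make that concrete, here is a rough sketch in Python. It is not NiFi's actual API: the
FlowFile class and the json_reader / json_writer functions are hypothetical stand-ins that
only illustrate the bytes -> Records -> bytes round trip described above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in: a FlowFile is just attributes plus bytes."""
    attributes: dict = field(default_factory=dict)
    content: bytes = b""


def json_reader(content):
    """Plays the role of a Record Reader: bytes in, record objects out."""
    return json.loads(content.decode("utf-8"))


def json_writer(records):
    """Plays the role of a Record Writer: record objects in, bytes out."""
    return json.dumps(records).encode("utf-8")


incoming = FlowFile({"filename": "people.json"}, b'[{"id": 1, "name": "alice"}]')

# The processor can only work with the parsed records, not the raw bytes.
records = json_reader(incoming.content)

# To pass anything downstream, the records must become bytes again.
outgoing = FlowFile(dict(incoming.attributes), json_writer(records))
```

Without something playing the json_writer role, the sketch would be left holding Python
objects with no way to fill in the content of the outgoing FlowFile.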

Does this help to clarify things, or only muddy the water worse? :)


On Nov 5, 2017, at 3:53 PM, Paul Riddle wrote:

Hi Mark!

Thanks for the fast response.  That does make sense.  Since I am not making any modifications,
just validating against a given schema, there is nothing for the Record Writer to do.  I am
still a little confused as to why it is a required Property in the ValidateRecord processor,


On Sun, Nov 5, 2017 at 3:46 PM, Mark Payne wrote:
Hey Paul,

That is accurate - the Record Writer chosen will not affect the validation process.
The way that the processor works is to read in records, one at a time, from a FlowFile.
Once a record has been read, it is validated against the given schema. It is then written
to either the 'valid' relationship or the 'invalid' relationship. When this happens, the chosen
Record Writer is used to write it out.
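
A minimal sketch of that read / validate / route / write loop, in Python rather than NiFi's
own code. The schema format, is_valid, and validate_and_route are all made up for
illustration, and the check is far simpler than the processor's real validation rules.

```python
import json

# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"id": int, "name": str}


def is_valid(record, schema):
    """Simplified check: same field names, and each value has the expected type."""
    return set(record) == set(schema) and all(
        isinstance(record[name], expected) for name, expected in schema.items()
    )


def json_writer(records):
    """Stand-in for the configured Record Writer: records in, bytes out."""
    return json.dumps(records).encode("utf-8")


def validate_and_route(records, schema, write):
    """Validate records one at a time, then serialize each relationship's records."""
    routed = {"valid": [], "invalid": []}
    for record in records:
        relationship = "valid" if is_valid(record, schema) else "invalid"
        routed[relationship].append(record)
    # The writer has no say in validation; it only decides the output bytes.
    return {rel: write(recs) for rel, recs in routed.items()}


records = [{"id": 1, "name": "alice"}, {"id": "two", "name": "bob"}]
output = validate_and_route(records, SCHEMA, json_writer)
```

Swapping json_writer for a different writer function would change the bytes in output but
not which records land in 'valid' versus 'invalid'.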

So it would be very common to have a CSV Reader with a CSV Writer or a JSON Reader
with a JSON Writer, for instance. However, you could also configure a CSV Reader with
a JSON Writer, and it will essentially convert the record for you inline.
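
As an illustration of that inline conversion, here is a small Python sketch. The csv_reader
and json_writer functions are hypothetical stand-ins for the controller services, not NiFi
code, and no schema is applied, so every CSV value stays a string.

```python
import csv
import io
import json


def csv_reader(content):
    """Stand-in for a CSV Reader: CSV bytes in, one dict per row out."""
    text = io.StringIO(content.decode("utf-8"))
    return [dict(row) for row in csv.DictReader(text)]


def json_writer(records):
    """Stand-in for a JSON Writer: records in, JSON bytes out."""
    return json.dumps(records).encode("utf-8")


flowfile_content = b"id,name\n1,alice\n"

# Reading with one format and writing with another converts the data in passing.
converted = json_writer(csv_reader(flowfile_content))
```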

This is a very common pattern for the record-oriented processors, because the records are
read in, parsed, and turned into a 'Record' object. Once this has happened, we can treat that
Record object the same, whether it was parsed from a CSV file, a JSON file, or some custom
format. This, of course, provides us with some very powerful, reusable processors! Once we've
finished working with that Record object, though, we need to pass it on in some way. So we
use a Record Writer to serialize it back out.

Does that all make sense?


On Nov 5, 2017, at 3:24 PM, Paul Riddle wrote:

Hello All,

With regard to the NiFi 1.4 ValidateRecord processor, it doesn't appear to matter what Record
Writer I choose.  As long as the Record Reader can read the incoming flowfile and the Schema
Access Strategy validates my flowfile, it comes out of the "valid" relationship.

Am I missing some other purpose for the Record Writer property in the ValidateRecord Processor?
If so, I would like to understand it better.


