nifi-users mailing list archives

From Arturo Michel <Arturo.Mic...@leotech.com.sg>
Subject CreateHadoopSequenceFile Processor Key adding sf suffix
Date Wed, 11 May 2016 14:39:18 GMT
I am using the CreateHadoopSequenceFile processor to create a sequence file from incoming data,
effectively time stamping my data at this point: the current time is the key and the data is
the value of the sequence file. To do this I (momentarily) change the filename attribute to
${now()}, so that I get a sequence file where the key is the time and the content is the data.
However, the processor adds the .sf suffix, which makes it all the way into the key.


I end up with the following structure [40668712567.sf | [data bytes]].


I understand that the file is written as filename.sf, but shouldn't the key omit the .sf suffix
and be only the file name?


Looking at the code in https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/CreateHadoopSequenceFile.java


    final String fileName = flowFile.getAttribute(CoreAttributes.FILENAME.key()) + ".sf";
    flowFile = session.putAttribute(flowFile, CoreAttributes.FILENAME.key(), fileName);
    try {
        flowFile = sequenceFileWriter.writeSequenceFile(flowFile, session, getConfiguration(), compressionType);
        session.transfer(flowFile, RELATIONSHIP_SUCCESS);
        getLogger().info("Transferred flowfile {} to {}", new Object[]{flowFile, RELATIONSHIP_SUCCESS});
    } catch (ProcessException e) {
        getLogger().error("Failed to create Sequence File. Transferring {} to 'failure'", new Object[]{flowFile}, e);
        session.transfer(flowFile, RELATIONSHIP_FAILURE);
    }



The file name is changed before the flow file is passed to the writer. The default sequence
file writer (and I think the others as well) uses the file name as received to write the key.


https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-hadoop-bundle/nifi-hdfs-processors/src/main/java/org/apache/nifi/processors/hadoop/SequenceFileWriterImpl.java


    String key = flowFile.getAttribute(CoreAttributes.FILENAME.key());
    writer.append(new Text(key), inStreamWritable);
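
To make the order of events concrete, here is a minimal standalone sketch in plain Java. It uses no NiFi or Hadoop classes: the class SequenceKeyDemo, its methods, and the attribute map are hypothetical stand-ins that only mimic the two snippets above. The keyWithoutSuffix method is just an illustration of the behavior I was expecting, not something that exists in the processor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the processor/writer interaction; not NiFi code.
public class SequenceKeyDemo {
    static final String SUFFIX = ".sf";

    // Mimics the processor: the filename attribute gets ".sf" appended
    // before the flow file is handed to the writer.
    static Map<String, String> appendSuffix(Map<String, String> attributes) {
        Map<String, String> updated = new HashMap<>(attributes);
        updated.put("filename", attributes.get("filename") + SUFFIX);
        return updated;
    }

    // Mimics the writer: the key is the filename attribute exactly as received.
    static String keyAsWritten(Map<String, String> attributes) {
        return attributes.get("filename");
    }

    // Assumed alternative: drop a trailing ".sf" before using the name as the key.
    static String keyWithoutSuffix(Map<String, String> attributes) {
        String name = attributes.get("filename");
        return name.endsWith(SUFFIX)
                ? name.substring(0, name.length() - SUFFIX.length())
                : name;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("filename", "40668712567"); // value produced by ${now()}

        Map<String, String> afterProcessor = appendSuffix(attributes);

        System.out.println(keyAsWritten(afterProcessor));     // prints 40668712567.sf
        System.out.println(keyWithoutSuffix(afterProcessor)); // prints 40668712567
    }
}
```

The first printed value matches the key I currently get; the second is the key I would like to end up with.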



I am trying to time stamp the data because the source system does not have that capability.
Suggestions for working around this issue are welcome.



Best Regards.













