nifi-users mailing list archives

From Matthew Clarke <matt.clarke....@gmail.com>
Subject Re: splitText output appears to be getting dropped
Date Fri, 19 Feb 2016 18:29:08 GMT
Conrad,
     The MergeContent processor bins files according to its configuration.
Because it takes multiple files and creates one output file from them, that
output file cannot have multiple filenames. MergeContent uses the filename
of the first file in the bin as the filename of the output file. As for the
rest of the attributes from the source files, the 'Attribute Strategy'
property in MergeContent determines how they are applied to the output file.
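
As a rough illustration of the behaviour described above, here is a toy
Python model (not NiFi's code). The merged file takes the first file's
filename, and a strategy flag decides what happens to the other attributes.
The names merge_bin and keep_all_unique and the sample attributes are made up
for this sketch. The two branches are modelled on my understanding of the
'Keep Only Common Attributes' and 'Keep All Unique Attributes' strategies, so
the real processor may differ in detail.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a flowfile: (attributes, content).
FlowFile = Tuple[Dict[str, str], bytes]


def merge_bin(flowfiles: List[FlowFile],
              demarcator: bytes = b"\n",
              keep_all_unique: bool = False) -> FlowFile:
    """Toy model of merging one bin; an assumption-laden sketch only."""
    if not flowfiles:
        raise ValueError("cannot merge an empty bin")

    first_attrs, _ = flowfiles[0]
    all_attrs = [attrs for attrs, _ in flowfiles]

    if keep_all_unique:
        # Keep every attribute seen on any file, unless two files
        # disagree about its value.
        merged: Dict[str, str] = {}
        conflicts = set()
        for attrs in all_attrs:
            for key, value in attrs.items():
                if key in merged and merged[key] != value:
                    conflicts.add(key)
                merged.setdefault(key, value)
        for key in conflicts:
            del merged[key]
    else:
        # Keep only attributes that have the same value on every file.
        merged = {
            key: value
            for key, value in first_attrs.items()
            if all(attrs.get(key) == value for attrs in all_attrs)
        }

    # One output file can only have one filename: use the first file's.
    merged["filename"] = first_attrs["filename"]

    # Join the contents with the demarcator between them.
    content = demarcator.join(body for _, body in flowfiles)
    return merged, content


if __name__ == "__main__":
    sample = [
        ({"filename": "a.log", "source": "syslog"}, b"line one"),
        ({"filename": "b.log", "source": "syslog"}, b"line two"),
    ]
    print(merge_bin(sample))
```

In this model, a filename set on each input before the merge survives only if
that input happens to be first in the bin, which is why renaming just before
PutFile is the safer place.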

Matt

On Fri, Feb 19, 2016 at 11:25 AM, Conrad Crampton <
conrad.crampton@secdata.com> wrote:

> Hi,
> Perfect!
> I tried \n for linefeed – didn’t think of shift+enter!
>
> The reason I was updating the filename early on in my flow was just that
> I already had an UpdateAttribute there, which was a handy place to do so. I
> can put it just before the PutFile though, so no major issue; I just wondered
> why this was happening and whether it was by design (feature) or a bug.
>
> Thanks
> Conrad
>
> From: Bryan Bende <bbende@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Friday, 19 February 2016 at 16:16
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: splitText output appears to be getting dropped
>
> Hello,
>
> MergeContent has properties for header, demarcator, and footer, and also
> has a strategy property which specifies whether these values come from a
> file or inline text.
>
> If you do inline text and specify a demarcator of a new line (shift +
> enter in the demarcator value) then binary concatenation will get you all
> of the lines merged together with new lines between them.
>
> As far as the file naming goes, can you just wait until after RouteOnContent
> to rename them? They just need to be renamed before the PutFile, but it
> doesn't necessarily have to be before RouteOnContent.
>
> Let us know if that helps.
>
> Thanks,
>
> Bryan
>
>
> On Fri, Feb 19, 2016 at 11:01 AM, Conrad Crampton <
> conrad.crampton@secdata.com> wrote:
>
>> Hi,
>> Sorry to piggyback on this thread, but I have pretty much the same issue.
>> I am splitting log files -> RouteOnContent (various paths), and two of
>> these paths (including unmatched) basically just need to be farmed off into
>> a directory in case they are needed later.
>> These go into a MergeContent processor, where I would like to merge them
>> into one file, with each flowfile’s content as a line in the file delimited
>> by a line feed (like the original file). Whichever way I try this, though,
>> it doesn’t quite do what I want. If I try Binary Concatenation the file ends
>> up as one long line; if TAR, each flowfile is a separate file in a TAR (not
>> surprisingly). There doesn’t seem to be any way of merging flowfile
>> content into one file (ideally with similar options to compress, specify
>> the number of files, etc.).
>>
>> Another question, related to the answer below (which really helped me out
>> with the same issue): if I rename the filename early on in my flow, it
>> appears to be changed back to its original value at the MergeContent
>> processor, so I have to put another UpdateAttribute step in after the
>> merge to rename the file.
>> The flow is
>>
>> UpdateAttributes -> RouteOnContent -> UpdateAttribute -> MergeContent -> PutFile
>>        ^                  ^                  ^                ^
>>        |                  |                  |                |
>> filename changed        same               same           reverted
>> If I put an extra UpdateAttribute before PutFile then it is fine. Logging
>> at each of the above points shows the filename updated to
>> ${uuid}-${filename}, but at the 'reverted' point it is back to the original
>> filename.
>>
>> Any suggestions, particularly on the first question?
>>
>> Thanks
>> Conrad
>>
>>
>>
>> From: Jeff Lord <jeffrey.lord@gmail.com>
>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Date: Friday, 19 February 2016 at 03:22
>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>> Subject: Re: splitText output appears to be getting dropped
>>
>> Matt,
>>
>> Thanks a bunch!
>> That did the trick.
>> Out of curiosity, is there a better way to handle this than writing each
>> line out to its own file?
>> Each file contains a single string that will be used to build a url.
>>
>> -Jeff
>>
>> On Thu, Feb 18, 2016 at 6:00 PM, Matthew Clarke <
>> matt.clarke.138@gmail.com> wrote:
>>
>>> Jeff,
>>>       It appears your files are being dropped because you are
>>> auto-terminating the failure relationship on your PutFile processor. When
>>> the SplitText processor splits the file by lines, every new file has the
>>> same filename as the original it came from. My guess is that the first
>>> file is being written to disk and all the others are failing because a
>>> file of the same name already exists in the target dir. Try adding an
>>> UpdateAttribute processor after the SplitText to rename all the files. The
>>> easiest way is to append each file's uuid to its filename. I also do not
>>> recommend auto-terminating failure relationships except in rare cases.
>>>
>>> Matt
>>> On Feb 18, 2016 8:36 PM, "Jeff Lord" <jeffrey.lord@gmail.com> wrote:
>>>
>>>> I have a pretty simple flow where I query for a list of ids using
>>>> executeProcess and then pass that list along to splitText, where I am
>>>> trying to split on each line to then dynamically build a url further down
>>>> the line using updateAttribute and so on.
>>>>
>>>> executeProcess -> splitText -> putFile
>>>>
>>>> For some reason I am only getting one file written with one line.
>>>> I would expect something more like 100 files each with one line.
>>>> Using the provenance reporter it appears that some of my items are
>>>> being dropped.
>>>>
>>>> Time: 02/18/2016 17:13:46.145 PST
>>>> Event Duration: No value set
>>>> Lineage Duration: 00:00:12.187
>>>> Type: DROP
>>>> FlowFile Uuid: 7fa42367-490d-4b54-a32f-d062a885474a
>>>> File Size: 14 bytes
>>>> Component Id: 3b37a828-ba2c-4047-ba7a-578fd0684ce6
>>>> Component Name: PutFile
>>>> Component Type: PutFile
>>>> Details: Auto-Terminated by failure Relationship
>>>>
>>>> Any ideas on what I need to change here?
>>>>
>>>> Thanks in advance,
>>>>
>>>> Jeff
>>>>
>>>
>>
>
>
