nifi-users mailing list archives

From Mark Payne <marka...@hotmail.com>
Subject Re: Uncompressing nested tar, tar.gz, gz, and zip files
Date Thu, 15 Jun 2017 13:53:12 GMT
Jim,

Rather than repeating chains of those processors, I would recommend creating a loop:

IdentifyMimeType [1] -> RouteOnAttribute -> gzip ?       -> CompressContent -> Back to IdentifyMimeType [1]
                                         -> tar or zip ? -> UnpackContent   -> Back to IdentifyMimeType [1]
                                         -> other ?      -> [Continue on through rest of your flow]
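
For anyone who finds it easier to read as code, the same loop can be sketched outside NiFi as a work queue in plain Python. This is only an illustration of the idea, not how the processors are implemented: the function names `classify` and `unpack_all` are made up for this example, and the magic-byte checks are a rough stand-in for what IdentifyMimeType plus RouteOnAttribute do in the flow.

```python
import gzip
import io
import tarfile
import zipfile
from collections import deque


def classify(data):
    """Hypothetical stand-in for IdentifyMimeType + RouteOnAttribute."""
    if data[:2] == b"\x1f\x8b":          # gzip magic number
        return "gzip"
    if data[:4] == b"PK\x03\x04":        # zip local file header
        return "zip"
    if data[257:262] == b"ustar":        # tar magic at offset 257
        return "tar"
    return "other"


def unpack_all(name, data):
    """Keep routing items back to classify() until nothing is compressed."""
    queue = deque([(name, data)])
    leaves = {}
    while queue:
        name, data = queue.popleft()
        kind = classify(data)
        if kind == "gzip":
            # Like CompressContent in decompress mode: one item in, one out.
            inner = name[:-3] if name.endswith(".gz") else name
            queue.append((inner, gzip.decompress(data)))
        elif kind == "tar":
            # Like UnpackContent: one item in, many out.
            with tarfile.open(fileobj=io.BytesIO(data)) as tar:
                for member in tar.getmembers():
                    if member.isfile():
                        content = tar.extractfile(member).read()
                        queue.append((f"{name}/{member.name}", content))
        elif kind == "zip":
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                for info in zf.infolist():
                    if not info.is_dir():
                        queue.append((f"{name}/{info.filename}", zf.read(info)))
        else:
            # "other": an atomic file, ready for the rest of the flow.
            leaves[name] = data
    return leaves
```

Because every unpacked item goes back onto the queue, the depth of nesting never has to be known in advance, which is the point of looping back to IdentifyMimeType rather than chaining a fixed number of processors.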


Does that make sense?

Thanks
-Mark


> On Jun 15, 2017, at 9:48 AM, James McMahon <jsmcmahon3@gmail.com> wrote:
> 
> Hello. I have incoming directories of files that contain nested numbers of tar, gz, zip,
> gzip, etc compressed files. The highest level arrives as a tar, but from that point forward
> I may or may not find results from that tar that include additional compressed files or not.
> My initial uncompress of the highest level tar may simply return regular files to me.
> 
> Has anyone developed a workflow to handle such indeterminate nested compressed files?
> My goal is to uncompress all so that I have a set of atomic files to work with.
> 
> In my current workflow I use repeated chains of
> IdentifyMimeType --> RouteOnAttribute --> isCompressed is true --> UnpackContent
> but though this works it is not practical to anticipate in such a fixed manner the number
> of embedded compressed files I may have to handle.
> 
> Thanks in advance for your help. -Jim

