nifi-users mailing list archives

From James McMahon <jsmcmah...@gmail.com>
Subject Uncompressing nested tar, tar.gz, gz, and zip files
Date Thu, 15 Jun 2017 13:48:16 GMT
Hello. I have incoming directories of files that contain compressed files
(tar, gz, zip, gzip, etc.) nested to an unknown depth. The highest level
arrives as a tar, but from that point on the contents may or may not include
additional compressed files. My initial uncompress of the highest-level tar
may simply return regular files.

Has anyone developed a workflow to handle compressed files nested to an
indeterminate depth? My goal is to uncompress everything so that I end up
with a set of atomic files to work with.

In my current workflow I use repeated chains of
IdentifyMimeType-->RouteOnAttribute-->isCompressed is true-->UnpackContent.
This works, but it is not practical to anticipate, in such a fixed manner,
how many levels of embedded compressed files I may have to handle.
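
To make the goal concrete, here is a minimal sketch in plain Python (not a
NiFi flow) of the behavior being asked for: keep identifying and unpacking
until nothing compressed remains. The function name unpack_all and the
path-style keys are hypothetical, chosen only for illustration. Detection by
leading magic bytes is a simplifying assumption, so an ordinary file that
happens to begin with those bytes would raise an error here.

```python
import gzip
import io
import tarfile
import zipfile


def unpack_all(name, data):
    """Return {path: bytes} for every atomic file reachable from one blob.

    Hypothetical helper: recurses until a blob is no longer tar, gzip, or zip.
    """
    # gzip streams start with the magic bytes 1f 8b.
    if data[:2] == b"\x1f\x8b":
        inner_name = name[:-3] if name.endswith(".gz") else name
        return unpack_all(inner_name, gzip.decompress(data))

    # Non-empty zip archives start with a local file header, "PK\x03\x04".
    if data[:4] == b"PK\x03\x04":
        result = {}
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    child = f"{name}/{info.filename}"
                    result.update(unpack_all(child, zf.read(info)))
        return result

    # Uncompressed tar: let tarfile validate the header instead of guessing.
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tf:
            result = {}
            for member in tf.getmembers():
                if member.isfile():
                    content = tf.extractfile(member).read()
                    child = f"{name}/{member.name}"
                    result.update(unpack_all(child, content))
            return result
    except tarfile.ReadError:
        pass

    # Not a recognized archive: treat it as an atomic file.
    return {name: data}
```

Calling unpack_all("top.tar", raw_bytes) would return one entry per atomic
file, however deep the nesting goes. The point of the sketch is that the
depth is handled by recursion rather than by a fixed number of chained
steps. Everything is held in memory, so this is only reasonable for small
inputs.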

Thanks in advance for your help. -Jim
