commons-dev mailing list archives

From sebb <>
Subject Re: [COMPRESS] Add getArchiveType() method to [Archive|Compressor]InputStream classes?
Date Fri, 14 May 2010 10:08:01 GMT
On 14/05/2010, Stefan Bodewig <> wrote:
> On 2010-05-14, sebb <> wrote:
>  > On 14/05/2010, Stefan Bodewig <> wrote:
>  >> On 2010-05-12, sebb <> wrote:
>  >>> Can we restart this?
>  >>> Compress currently has Archiver and Compressor InputStreamFactory
>  >>> classes which have the following signatures:
>  >>> public ArchiveInputStream createArchiveInputStream(
>  >>>             final String archiverName, final InputStream in)
>  >>> public CompressorInputStream createCompressorInputStream(
>  >>>             final String name, final InputStream in)
>  >>> The archiverName or name parameters are used to determine which stream
>  >>> class to create, and should use one of the provided public constants.
>  >>> I just want to provide the inverse conversion, i.e. tag the created
>  >>> class with the key that was used to create it, so that this can be
>  >>> easily determined later e.g. if the autodetect mode is used.
>  >> In theory the inverse is not unique - the same implementation could be
>  >>  used for several names.  It may become true in practice if we'd choose
>  >>  to use different names for variants of archive types like pax, posixtar
>  >>  and gnutar for tar.
>  >>  I think it depends on what you want the method that returns the archive
>  >>>  type/name(s) to reflect.
>  >>   * The archive names you could pass into the create methods to obtain an
>  >>    instance of the same stream class?  Then the method actually should be
>  >>    inside the factory rather than the stream class since only the
>  >>    factory can know that for sure.
>  > Yes, but the factory can pass in the name to the created stream.
> What if I don't create the stream via the factory?

Then the default name is used.
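
A minimal sketch of that idea, assuming a hypothetical base class and
factory (TaggedArchiveInputStream, SketchTarInputStream and
SketchArchiveFactory are illustrative names, not the existing Compress
classes): the factory passes the requested name to the stream, and the
plain constructor falls back to a default.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical base class: remembers the key that was used to create it.
abstract class TaggedArchiveInputStream extends FilterInputStream {
    private final String archiveType;

    protected TaggedArchiveInputStream(InputStream in, String archiveType) {
        super(in);
        this.archiveType = archiveType;
    }

    /** Returns the name the stream was created with, or its default. */
    public String getArchiveType() {
        return archiveType;
    }
}

// Hypothetical tar stream; the archive parsing itself is omitted.
class SketchTarInputStream extends TaggedArchiveInputStream {
    static final String DEFAULT_TYPE = "tar";

    // Used by callers that bypass the factory: the default name applies.
    SketchTarInputStream(InputStream in) {
        this(in, DEFAULT_TYPE);
    }

    // Used by the factory, which knows the name that was requested.
    SketchTarInputStream(InputStream in, String archiveType) {
        super(in, archiveType);
    }
}

// Hypothetical factory: "gnutar" is an assumed variant name for illustration.
class SketchArchiveFactory {
    static TaggedArchiveInputStream create(String archiverName, InputStream in) {
        if ("tar".equals(archiverName) || "gnutar".equals(archiverName)) {
            return new SketchTarInputStream(in, archiverName);
        }
        throw new IllegalArgumentException("Unknown archiver: " + archiverName);
    }
}
```

Here new SketchTarInputStream(in).getArchiveType() gives "tar", while
SketchArchiveFactory.create("gnutar", in).getArchiveType() gives "gnutar".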

>  To make this work all implementations would need a new constructor with
>  an additional argument - that wouldn't make any sense for users who
>  bypass the factory.

Only for classes that support multiple formats, but then the ctor
needs to be told which it is handling anyway.

>  If this is the purpose of the method I'd prefer to see it inside the
>  factory, some getArchiverNames(Class) or
>  getArchiverNames(ArchiveInputStream) that returned an immutable list of
>  strings.
>  We could still put a method into ArchiveInputStream that queried the
>  factory passing in itself.
>  >>   * The format of the current stream?  In that case there can't be any
>  >>    reasonable default (other than asking the factory and potentially
>  >>    getting back a list of names rather than a single one).
>  > That would not be necessary if every IS that is created is given its
>  > name via the ctor.
> Assume we have multiple names for one implementation.  For example
>  old_ascii, old_binary, new_portable and new_portable_crc for cpio since
>  these really are different formats.  If we use autodetection - which is
>  why you want the method in the first place IIUC - then which name do you
>  expect the factory to pass into said constructor?

cpio + old_ascii etc (see below).

>  Only the implementation knows the real format it has detected.

At present, the implementations *don't* actually know, as they all
process format differences on the fly; they don't check that the
same format is being used throughout the file, and each record may have
different settings so long as these are self-consistent.
This is perhaps a bug.

Only the factory currently knows which version of a format is being
used, because the matching is performed in the factory code. And that
information is currently lost once the input stream has been created.

If we do wish the IS classes to validate the format, then the classes
will need to be created accordingly, and can therefore return their
specific format.

But I think a class that processes various versions of cpio should
return "cpio" as its fundamental type; it may also return "old_ascii"
etc as the version of the type.
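
As a rough illustration of the type/version split, assuming a
hypothetical value class (ArchiveFormat is not an existing Compress
type), the fundamental type is always present and the version is
optional:

```java
import java.util.Optional;

// Hypothetical value type pairing a fundamental type with an optional version.
final class ArchiveFormat {
    private final String type;     // fundamental type, e.g. "cpio"
    private final String version;  // variant, e.g. "old_ascii"; null if unknown

    ArchiveFormat(String type, String version) {
        if (type == null) {
            throw new IllegalArgumentException("type is required");
        }
        this.type = type;
        this.version = version;
    }

    /** The fundamental type, regardless of which variant was detected. */
    String getType() {
        return type;
    }

    /** The detected variant, or empty if none was recorded. */
    Optional<String> getVersion() {
        return Optional.ofNullable(version);
    }

    @Override
    public String toString() {
        return version == null ? type : type + "/" + version;
    }
}
```

A stream that handles all the cpio variants would then report
new ArchiveFormat("cpio", "old_ascii") when that variant was matched,
and callers that only care about the basic type use getType().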

>  So if this is the purpose of the new method then it has to live inside
>  the implementation and I don't see which default implementation could
>  make sense.

Apart from Zip/Jar, all the current classes only support a single type.

>  >>> It might also be useful for the classes to provide access to their
>  >>> associated mime type(s) and "usual" file extension(s). These should
>  >>> both be read-only Lists.
>  >> OK.
>  >>  This isn't anything the factory could do (unless we broaden its
>  >>  contract) nor the base class could provide a reasonable default for.
>  > The base class could define an abstract method.
>  > The only possible defaults would be empty Lists.
>  > These would only be useful in the sense that subclasses would not need
>  > to implement the methods.
> That's what I was trying to say - the default of an empty list is not
>  reasonable ("it simply doesn't make sense" in case I'm just using the
>  wrong English word for what I mean).

We could define an empty list to mean "unknown".
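
A sketch of what that could look like, assuming hypothetical class and
method names (DescribedStream, getMimeTypes and getFileExtensions are
illustrative only): the base class returns empty read-only lists, and a
subclass overrides them where the values are known.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical base class: an empty list is defined to mean "unknown".
abstract class DescribedStream {
    public List<String> getMimeTypes() {
        return Collections.emptyList();
    }

    public List<String> getFileExtensions() {
        return Collections.emptyList();
    }
}

// Hypothetical zip stream that knows its MIME type and usual extension.
class SketchZipStream extends DescribedStream {
    private static final List<String> MIME_TYPES =
            Collections.unmodifiableList(Arrays.asList("application/zip"));
    private static final List<String> EXTENSIONS =
            Collections.unmodifiableList(Arrays.asList("zip"));

    @Override
    public List<String> getMimeTypes() {
        return MIME_TYPES;
    }

    @Override
    public List<String> getFileExtensions() {
        return EXTENSIONS;
    }
}

// Hypothetical stream that does not override anything: reports "unknown".
class SketchUnknownStream extends DescribedStream {
}
```

Callers can test getMimeTypes().isEmpty() to detect the "unknown" case,
and any attempt to modify a returned list throws
UnsupportedOperationException.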

>  Stefan
>  ---------------------------------------------------------------------
>  To unsubscribe, e-mail:
>  For additional commands, e-mail:
