commons-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-459) CPIO fails decoding multibyte name entries
Date Wed, 11 Jul 2018 12:54:08 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540049#comment-16540049 ]

ASF GitHub Bot commented on COMPRESS-459:
-----------------------------------------

GitHub user bodewig commented on the issue:

    https://github.com/apache/commons-compress/pull/67
  
    `ZipEncoding` is a better choice for our internal use, as it is used to encode the name
    and transparently handles the "use null as the encoding to use the platform's default
    encoding" case. You don't have to make the changes yourself, but I won't complain if you
    are faster than me. If you do, please add spaces around operators :-)
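
    The "null means platform default" convention mentioned above can be illustrated with
    plain JDK classes. This is only a minimal sketch: `NameCharsetSketch` and its methods are
    hypothetical names made up for illustration, and this is not how `ZipEncoding` itself is
    implemented.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; NOT the Commons Compress ZipEncoding API.
final class NameCharsetSketch {

    // A null encoding name is taken to mean "use the platform's default charset".
    static Charset resolve(String encodingName) {
        return encodingName == null ? Charset.defaultCharset() : Charset.forName(encodingName);
    }

    // Encode an entry name to the bytes that would be written to the archive.
    static byte[] encodeName(String name, String encodingName) {
        return name.getBytes(resolve(encodingName));
    }

    // Decode the raw name bytes read from the archive back into a String.
    static String decodeName(byte[] raw, String encodingName) {
        return new String(raw, resolve(encodingName));
    }

    public static void main(String[] args) {
        final String name = "gr\u00fc\u00dfe.txt"; // contains two non-ASCII characters
        final byte[] raw = encodeName(name, "UTF-8");
        System.out.println(raw.length);                           // 11 bytes for 9 chars
        System.out.println(decodeName(raw, "UTF-8").equals(name)); // true
        System.out.println(resolve(null));                         // platform default charset
    }
}
```

    Keeping the null handling in one place means callers never need to special-case the
    default encoding themselves.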


> CPIO fails decoding multibyte name entries
> ------------------------------------------
>
>                 Key: COMPRESS-459
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-459
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.9, 1.17
>            Reporter: Jens Reimann
>            Priority: Major
>              Labels: patch-available
>             Fix For: 1.18
>
>
> When a CPIO archive uses a multi-byte encoding (e.g. UTF-8) and an entry name contains
> multi-byte characters, the decoder fails.
> The problem IMHO is the "getHeaderPadCount" method, which assumes a single byte per character:
>  
> {code:java}
>     public int getHeaderPadCount(){
>         if (this.alignmentBoundary == 0) { return 0; }
>         int size = this.headerSize + 1;  // Name has terminating null
>         if (name != null) {
>             size += name.length();
>         }
>         final int remain = size % this.alignmentBoundary;
>         if (remain > 0){
>             return this.alignmentBoundary - remain;
>         }
>         return 0;
>     }
> {code}
> However, the one-byte-per-character assumption may (or may not) hold for UTF-8.
>  
> Also, it wouldn't be enough to call "String#getBytes(…)", as this might already transform
> the underlying bytes.
> The proper solution would be to provide the name size, as read from the CPIO stream,
> and pass it to the entry.
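
The mismatch described in the issue can be reproduced with a small standalone sketch. It is not the actual patch: `CpioPadSketch` and `headerPadCount` are hypothetical names, the name size is assumed to exclude the terminating NUL, and the header size of 110 with an alignment of 4 is assumed here purely for illustration. `getBytes` is used only to show that the character count and byte count differ; the issue proposes passing the size read from the stream instead.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch for illustration only; NOT the Commons Compress implementation.
final class CpioPadSketch {

    /**
     * Pad bytes needed after header + name + terminating NUL to reach the next
     * multiple of alignmentBoundary. nameSizeInBytes is assumed to exclude the NUL.
     */
    static int headerPadCount(int headerSize, int alignmentBoundary, long nameSizeInBytes) {
        if (alignmentBoundary == 0) {
            return 0;
        }
        final long size = headerSize + 1 + nameSizeInBytes; // +1 for the terminating NUL
        final int remain = (int) (size % alignmentBoundary);
        return remain > 0 ? alignmentBoundary - remain : 0;
    }

    public static void main(String[] args) {
        final String name = "gr\u00fc\u00dfe.txt"; // two characters need 2 bytes each in UTF-8
        final int chars = name.length();                                 // 9
        final int bytes = name.getBytes(StandardCharsets.UTF_8).length;  // 11

        // Assumed values for illustration: header size 110, alignment boundary 4.
        System.out.println(headerPadCount(110, 4, chars)); // 0 (what name.length() would yield)
        System.out.println(headerPadCount(110, 4, bytes)); // 2 (based on the on-disk byte count)
    }
}
```

With these values the two calls disagree by two bytes, which is enough to make the reader start the next field at the wrong offset.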



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
