hadoop-common-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13079) dfs -ls -q prints non-printable characters
Date Thu, 05 May 2016 03:43:12 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271848#comment-15271848 ]

Colin Patrick McCabe commented on HADOOP-13079:

bq. a) It's not standardized behavior amongst all of the platforms that Apache Hadoop runs on

Linux, OpenBSD, FreeBSD, and OS X pick the behavior of hiding control characters in {{ls}}
by default.  That may not be "all the platforms that Apache Hadoop runs on," but it's certainly
the vast majority of real-world deployments.  The remaining important platform, Windows, doesn't
deal with terminals and control characters in quite the same way, so is probably not vulnerable
in any case.

Regardless, the fact that the behavior isn't standardized is not a valid argument either
way.  Clearly Hadoop needs to pick one behavior or the other, and lack of standardization
doesn't dictate which one we have to pick.  It certainly doesn't dictate
that we should pick an unpopular and surprising behavior that almost nobody has experience with.

bq. b) It's not expected behavior relative to the rest of Apache Hadoop

The fact that one component has a security bug doesn't dictate that the other components also
need to have the same security bug.  This is like arguing that we can't fix a buffer overflow
in one component because then it wouldn't match all the other buffer-overflowable components.

bq. c) It's not feasible to actually make it expected behavior compared to the rest of Apache
Hadoop given the proliferation of places where raw file and directory names are printed to
the console

The only places we've discussed here are ls and fsck.  Perhaps there are more, but it hardly
seems infeasible to change them based on what we've talked about so far.  Perhaps log files
are also an issue, but only for people who tail the log file of the server.  And to reiterate,
a security flaw in X doesn't mean we should reproduce the same security flaw in Y.

At the end of the day, this is a security vulnerability and it needs to be fixed.  I asked
you before: "Should the filename be able to use control characters to hijack the admin's GNU
screen session and execute arbitrary code? I would say no, what do you say?"  I would repeat
the same question again.

I understand that you have a personal preference for running without {{-q}}.  However, it
is not constructive to -1 a patch fixing a security vulnerability without suggesting an alternate
way of fixing that vulnerability.  If this stays unfixed, it will probably get a CVE number.

> dfs -ls -q prints non-printable characters
> ------------------------------------------
>                 Key: HADOOP-13079
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13079
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: John Zhuge
>            Assignee: John Zhuge
> Add option {{-q}} to "hdfs dfs -ls" to print non-printable characters as "?". Non-printable
> characters are defined by [isprint(3)|http://linux.die.net/man/3/isprint] according to the
> current locale.
> Default to {{-q}} behavior on terminal; otherwise, print raw characters. See the difference
> in these 2 command lines:
> * {{hadoop fs -ls /dir}}
> * {{hadoop fs -ls /dir | od -c}}
> In C, {{isatty(STDOUT_FILENO)}} is used to find out whether the output is a terminal.
> Since Java doesn't have {{isatty}}, I will use JNI to call C {{isatty()}} because the closest
> test {{System.console() == null}} does not work in some cases.
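
For illustration only, the behavior described in the issue could be sketched in Java roughly
as follows.  This is not the patch attached to HADOOP-13079.  The class {{LsQuoteSketch}} and
its methods are hypothetical names, the printable test uses {{Character.isISOControl}} as a
locale-independent stand-in for isprint(3), and the terminal check uses
{{System.console() != null}}, which the description notes does not work in some cases.

```java
public class LsQuoteSketch {

    // Assumption: a code point counts as printable if it is defined and is
    // not an ISO control character. This only approximates isprint(3) and
    // ignores the current locale.
    static boolean isPrintable(int cp) {
        return Character.isDefined(cp) && !Character.isISOControl(cp);
    }

    // Returns a copy of the name with every non-printable code point
    // replaced by '?', similar in spirit to what "ls -q" shows.
    static String quoteNonPrintable(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        name.codePoints()
            .forEach(cp -> sb.appendCodePoint(isPrintable(cp) ? cp : '?'));
        return sb.toString();
    }

    public static void main(String[] args) {
        // A file name carrying an escape sequence and a tab.
        String name = "a\u001b[2Jb\tc";

        // Rough stand-in for isatty(STDOUT_FILENO). It is not a reliable
        // test of whether stdout is a terminal.
        boolean looksLikeTerminal = System.console() != null;

        // Quote on a terminal, print raw characters otherwise.
        System.out.println(looksLikeTerminal ? quoteNonPrintable(name) : name);
    }
}
```

Under these assumptions, a name containing an escape character followed by {{[2J}} is shown
with "?" in place of the escape character on a terminal, while piped output (for example
into {{od -c}}) keeps the raw characters.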
