hadoop-common-issues mailing list archives

From "Mike Yoder (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12942) hadoop credential commands non-obviously use password of "none"
Date Tue, 22 Mar 2016 19:28:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207098#comment-15207098 ]

Mike Yoder commented on HADOOP-12942:

I have heard reluctance from folks in the past about having commands prompt for passwords, and
it would certainly break the scriptability of the command. We would have to add a switch that
enabled prompting for a password, if we were to add it to the credential create subcommand.

Agreed. Today, as you know, the credential create command prompts for a password, but there is
an undocumented "-value" argument that can be used. I'd stick with the same scheme, where
either a prompt or a command line argument is possible.

This same password file is used in lots of scenarios though: KMS, javakeystore providers for
key provider API, oozie, signing secret providers, etc. I wonder whether a separate command
for it would make sense.
Conceptually, yes, but aren't config values different?  I'm aware of two:
* alias/AbstractJavaKeyStoreProvider: hadoop.security.credstore.java-keystore-provider.password-file
* key/JavaKeyStoreProvider: hadoop.security.keystore.java-keystore-provider.password-file

Keep in mind that we would need to do a number of things for this.
1. prompt for the password
2. persist it
3. set appropriate permissions on the file
4. somehow determine the filename to use (probably based on the password file name configuration)
which would need to be provided by the user as well
5. allow for use of the same password file for multiple keystores or scenarios
6. allow for random-ish generated password without prompt
I think it's even more complicated. :-) The user could want to use the environment variable
when the credential is consumed, and so would want to provide it to the command but would
not want to deal with anything file-related. 

Also it's conceivable that the user could have constructed the file themselves; although this
doesn't seem particularly user friendly. 

So we have scenarios for hadoop credential create|list|etc that look like
# Here is the credstore password from a prompt
# Here is the credstore password on the command line
# The credstore password is already in a file in the "expected" location (set up either by
hand or via your new pwdfile command).

Making a command to manage the password file makes sense. I think that we shouldn't ask the
user to give it the property name though: you could modify KeyShell and CredentialShell to
have a new subcommand of 'pwdfile', thusly:
* hadoop credential pwdfile \[args\]
* hadoop key pwdfile \[args\]

And they could share an implementation. This way the user does not have to remember "hadoop.security.credstore.java-keystore-provider.password-file"
or the like. This also means that the provider selected needs a new interface to create said
file, if applicable.

I like the auto-generate-password option for the file. I think the default would be to still
prompt for the password, though.  So yeah, adding a pwdfile command seems like a good idea.
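
To make the pwdfile idea a bit more concrete, here is a minimal Java sketch of the file-handling
steps discussed above (generate or accept a password, persist it, set permissions). The class
PasswordFileSketch and its methods are hypothetical and not part of Hadoop. Deriving the file
name from the provider's configuration is left out, and the permission handling assumes a
POSIX file system.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Set;

// Hypothetical helper for illustration only; not an existing Hadoop class.
final class PasswordFileSketch {

  /** Generates a random password: 24 random bytes, URL-safe Base64 without padding. */
  static String generatePassword() {
    byte[] raw = new byte[24];
    new SecureRandom().nextBytes(raw);
    return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
  }

  /**
   * Writes the password to the given file, replacing any existing file.
   * The file is created with owner-only read/write permissions before the
   * password is written. Throws UnsupportedOperationException on file
   * systems without POSIX permissions.
   */
  static void writePasswordFile(Path file, String password) throws IOException {
    Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rw-------");
    Files.deleteIfExists(file);
    Files.createFile(file, PosixFilePermissions.asFileAttribute(ownerOnly));
    Files.write(file, password.getBytes(StandardCharsets.UTF_8));
  }
}
```

A prompting variant would pass the prompted value to writePasswordFile() instead of the
result of generatePassword().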

The thing about the existing design that I'm going back and forth on is that the CredentialShell
is high-level, and selects a provider and then simply passes information to the provider.
The password is implied and not passed directly, so the CredentialShell has no notion of whether
or not the underlying provider actually has a password or not.

So, for example, it would be daft of CredentialShell to accept a password on the command line
if one is provided in a file, and it would also be even more daft if no password was specified
on the command line and the password wasn't in the password file either. Furthermore it would
be silly to accept a password when the underlying provider does not need a password at all
for proper operation (example: the UserProvider). There has to be some amount of communication
between the CredentialShell and the provider in order to get the "is a password required"
and "where precisely is the password" cases correct.  
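
The cases in the paragraph above could be written down roughly as follows. This is only a
sketch: PasswordSourceSketch, its Source enum and resolve() are hypothetical names, and it
covers just the three inputs mentioned here (whether the provider needs a password, whether
one was given on the command line, whether one is in the password file). A prompt or the
environment variable would be further sources.

```java
// Hypothetical decision logic for illustration only; not an existing Hadoop class.
final class PasswordSourceSketch {

  enum Source { NOT_NEEDED, COMMAND_LINE, PASSWORD_FILE }

  /** Decides where the store password comes from, or rejects inconsistent input. */
  static Source resolve(boolean providerNeedsPassword,
                        boolean onCommandLine,
                        boolean inPasswordFile) {
    if (!providerNeedsPassword) {
      // Example: a provider that keeps no password-protected store.
      if (onCommandLine) {
        throw new IllegalArgumentException("provider does not use a password");
      }
      return Source.NOT_NEEDED;
    }
    if (onCommandLine && inPasswordFile) {
      throw new IllegalArgumentException(
          "password given on the command line and in the password file");
    }
    if (onCommandLine) {
      return Source.COMMAND_LINE;
    }
    if (inPasswordFile) {
      return Source.PASSWORD_FILE;
    }
    throw new IllegalArgumentException("provider needs a password but none was supplied");
  }
}
```

The first argument is the part the shell cannot know today, which is why the provider
has to be asked.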

To make this even more interesting, in the various providers with a key store, the keyStore
is either created or opened in the constructor, requiring that all the information be presented
up front - without scope for the back and forth of "do you need a password and where" from
the provider.

So... one way to deal with this is to move the keyStore.load() call out of the constructor
and defer it until the first get/set/delete credential entry call. Then expose interfaces
along the lines of "does this provider already have the password somehow?" and "set the password
directly". We'd have to add default behavior in CredentialProvider (and KeyProvider) and then
implement in the ones that matter.

The downside to this approach is that we move around a few error conditions. However everything
can throw an IOException, so maybe this isn't a big deal. Seem reasonable? Alternative proposals?
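
For illustration, a rough Java sketch of the deferred-load idea, assuming a JCEKS store
accessed through the JDK's java.security.KeyStore. The class LazyKeyStoreSketch and the
methods needsPassword(), hasPassword() and setPassword() are hypothetical names for the
interfaces described above, not existing Hadoop API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

// Hypothetical provider-like class for illustration only; not an existing Hadoop class.
final class LazyKeyStoreSketch {
  private final Path path;
  private char[] password;   // null until someone supplies it
  private KeyStore keyStore; // null until the first operation needs it

  /** The constructor does no I/O, so the shell can still ask questions afterwards. */
  LazyKeyStoreSketch(Path path) {
    this.path = path;
  }

  /** "Is a password required?" A provider without a protected store would return false. */
  boolean needsPassword() {
    return true;
  }

  /** "Does this provider already have the password somehow?" */
  boolean hasPassword() {
    return password != null;
  }

  /** "Set the password directly." Only allowed before the store is loaded. */
  void setPassword(char[] pw) {
    if (keyStore != null) {
      throw new IllegalStateException("key store already loaded");
    }
    password = pw.clone();
  }

  /** Loads (or creates in memory) the key store on first use. */
  private KeyStore store() throws IOException {
    if (keyStore == null) {
      if (password == null) {
        throw new IOException("no password supplied for " + path);
      }
      try {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        if (Files.exists(path)) {
          try (InputStream in = Files.newInputStream(path)) {
            ks.load(in, password);
          }
        } else {
          ks.load(null, password); // start with an empty store
        }
        keyStore = ks;
      } catch (GeneralSecurityException e) {
        throw new IOException("cannot load key store " + path, e);
      }
    }
    return keyStore;
  }

  /** Example operation: errors that used to surface in the constructor surface here. */
  boolean containsAlias(String alias) throws IOException {
    try {
      return store().containsAlias(alias);
    } catch (GeneralSecurityException e) {
      throw new IOException(e);
    }
  }
}
```

In this sketch a missing password shows up as an IOException on the first operation rather
than at construction time, which is the kind of moved error condition mentioned above.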

> hadoop credential commands non-obviously use password of "none"
> ---------------------------------------------------------------
>                 Key: HADOOP-12942
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12942
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>            Reporter: Mike Yoder
> The "hadoop credential create" command, when using a jceks provider, defaults to using
the value of "none" for the password that protects the jceks file.  This is not obvious in
the command or in documentation - to users or to other hadoop developers - and leads to jceks
files that essentially are not protected.
> In this example, I'm adding a credential entry with name of "foo" and a value specified
by the password entered:
> {noformat}
> # hadoop credential create foo -provider localjceks://file/bar.jceks
> Enter password: 
> Enter password again: 
> foo has been successfully created.
> org.apache.hadoop.security.alias.LocalJavaKeyStoreProvider has been updated.
> {noformat}
> However, the password that protects the file bar.jceks is "none", and there is no obvious
way to change that. The practical way of supplying the password at this time is something
akin to
> {noformat}
> HADOOP_CREDSTORE_PASSWORD=credpass hadoop credential create -provider ...
> {noformat}
> That is, stuffing HADOOP_CREDSTORE_PASSWORD into the environment of the command. 
> This is more than a documentation issue. I believe that the password ought to be _required_.
 We have three implementations at this point, the two JavaKeystore ones and the UserCredential.
The latter is "transient" which does not make sense to use in this context. The former need
some sort of password, and it's relatively easy to envision that any non-transient implementation
would need a mechanism by which to protect the store that it's creating.  
> The implementation gets interesting because the password in the AbstractJavaKeyStoreProvider
is determined in the constructor, and changing it after the fact would get messy. So this
probably means that the CredentialProviderFactory should have another factory method like
the first that additionally takes the password, and an additional constructor exist in all
the implementations that takes the password. 
> Then we just ask for the password in getCredentialProvider() and that gets passed down
via the factory to the implementation. The code does have logic in the factory to try multiple
providers, but I don't really see how multiple providers would rationally be used in the
command shell context.
> This issue was brought to light when a user stored credentials for a Sqoop action in
Oozie; upon trying to figure out where the password was coming from we discovered it to be
the default value of "none".

This message was sent by Atlassian JIRA
