flink-issues mailing list archives

From vijikarthi <...@git.apache.org>
Subject [GitHub] flink issue #2275: FLINK-3929 Support for Kerberos Authentication with Keyta...
Date Thu, 21 Jul 2016 17:43:25 GMT
Github user vijikarthi commented on the issue:

    https://github.com/apache/flink/pull/2275
  
    Adding some more context on the implementation details, which are based on the design proposal
(https://docs.google.com/document/d/1-GQB6uVOyoaXGwtqwqLV8BHDxWiMO2WnVzBoJ8oPaAs/edit?usp=sharing)
    
    The current security implementation works in a subtle way: it relies on the Kerberos cache of the
user who starts the Flink process/jobs, and only in the context of supporting secure access to a
Hadoop cluster. The underlying UGI implementation of the Hadoop infrastructure is used to harden
the security using the keytab cache. For the Yarn mode of deployment, delegation tokens are created
and populated into the container environment (App Master/JM and TM).
    
    The current implementation falls short in two areas:
    1) Tokens eventually expire, which impacts long-running jobs
    2) There is no support for secure connections to Kafka and ZK (Kafka 0.9 and recent
ZK versions support Kerberos-based authentication using SASL/JAAS)
    
    This PR addresses the above gaps by providing keytab support for communicating securely with Hadoop
and Kafka/ZK services.
    
    1) Additional Configurations: 
    
    The following new security-specific configurations are added to the Flink configuration file.
    a) security.principal - user principal that the Flink process/connectors should authenticate
as
    b) security.keytab - keytab file location
    
    In standalone mode, it is assumed that the configuration already exists (set up manually) on
all cluster nodes where the JM and TMs will be running.
    
    In Yarn mode, the configuration (and keytab file) is expected only on the node from which
YarnCLI or FlinkCLI will be invoked. Application code takes care of copying the keytab file to
the JM/TM Yarn containers as a local resource for lookup.
    
    If no security configuration is provided, the delegation token mechanism still
works, to support backward compatibility (manual kinit before starting the JM/TMs).
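
    The choice between the two modes described above can be sketched as follows. This is only an
illustration, not code from the PR: the class SecuritySettings is hypothetical, java.util.Properties
stands in for the Flink configuration, and rejecting a configuration where only one of the two keys
is set is an assumption.

```java
import java.util.Properties;

// Hypothetical helper, not part of the PR: decides which login mode applies
// based on the two configuration keys described above.
final class SecuritySettings {
    enum Mode { KEYTAB, TICKET_CACHE }

    final Mode mode;
    final String principal; // null when mode is TICKET_CACHE
    final String keytab;    // null when mode is TICKET_CACHE

    private SecuritySettings(Mode mode, String principal, String keytab) {
        this.mode = mode;
        this.principal = principal;
        this.keytab = keytab;
    }

    // Properties is used here as a stand-in for the Flink configuration object.
    static SecuritySettings from(Properties conf) {
        String principal = trimToNull(conf.getProperty("security.principal"));
        String keytab = trimToNull(conf.getProperty("security.keytab"));
        if (principal == null && keytab == null) {
            // Nothing configured: fall back to the existing behavior
            // (manual kinit, delegation tokens).
            return new SecuritySettings(Mode.TICKET_CACHE, null, null);
        }
        if (principal == null || keytab == null) {
            // Assumption: a half-configured setup is treated as an error.
            throw new IllegalArgumentException(
                "security.principal and security.keytab must be set together");
        }
        return new SecuritySettings(Mode.KEYTAB, principal, keytab);
    }

    private static String trimToNull(String value) {
        if (value == null) {
            return null;
        }
        String trimmed = value.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }
}
```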
    
    2) Process-wide in-memory JAAS configuration to enable Kafka/ZK secure authentication.
     
    The JAAS configuration plays a critical role in authentication for a Kerberized application.
Kafka/ZK login module code is expected to construct a login context based on the supplied JAAS
configuration file entries and to authenticate to produce a subject. The context is constructed
with an application name, which acts as a lookup key into the configuration, yielding one or
more login modules. The login module implements the specific strategy, such as using a configured
keytab or using the user’s ticket cache.
    
    Instead of managing a per-connector JAAS configuration file, a process-wide JAAS configuration
object is initialized during the Flink bootstrap phase, providing all callers with a single login
module that is configured to log in using the supplied keytab.
    (https://docs.oracle.com/javase/7/docs/api/javax/security/auth/login/Configuration.html#setConfiguration(javax.security.auth.login.Configuration))
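
    A minimal sketch of such a process-wide configuration, using only JDK classes, might look like
the following. The class name KeytabJaasConfiguration and the install helper are hypothetical and are
not taken from the PR; the option names are those of the JDK's Krb5LoginModule.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

// Hypothetical class name; a sketch of an in-memory JAAS configuration that
// answers every application-name lookup with the same keytab-based entry.
final class KeytabJaasConfiguration extends Configuration {
    private final AppConfigurationEntry[] entries;

    KeytabJaasConfiguration(String principal, String keytabPath) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");       // log in from a keytab
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("storeKey", "true");
        options.put("doNotPrompt", "true");     // never ask for a password
        options.put("useTicketCache", "false"); // ignore the user's ticket cache
        entries = new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        // The application name is ignored: all callers get the same login module.
        return entries;
    }

    // Replaces the JVM-wide JAAS configuration, so it should run once at startup.
    static void install(String principal, String keytabPath) {
        Configuration.setConfiguration(
            new KeytabJaasConfiguration(principal, keytabPath));
    }
}
```

    With this installed, a lookup under any application name (for example "KafkaClient" or "Client")
returns the same keytab entry, so no per-connector JAAS file is needed.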
    
    To summarize, the following sequence happens when the secure configuration is enabled.
    Flink bootstrap code (both Yarn and Standalone) initializes the security context by
    a) Initializing UGI with the supplied keytab and principal, which takes care of handling
Kerberos authentication and login renewal for Hadoop services.
    b) Creating a process-wide JAAS configuration object for the Kafka/ZK login modules to support
Kerberos/SASL authentication. Login renewals are handled automatically by the ZK and Kafka
login module implementations.
    
    Some additional details are provided in the documentation page as well, which can be found
here:
    (https://github.com/vijikarthi/flink/blob/FLINK-3929/docs/internals/flink_security.md)

