sqoop-dev mailing list archives

From "Ping Wang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SQOOP-2894) Hive import with Parquet failed in Kerberos enabled cluster
Date Fri, 25 Mar 2016 06:02:25 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15211465#comment-15211465 ]

Ping Wang edited comment on SQOOP-2894 at 3/25/16 6:01 AM:
-----------------------------------------------------------

I found that Parquet support, added via SQOOP-1390, uses the Kite SDK API. According to the stack trace, org.kitesdk.data.spi.hive.MetaStoreUtil
calls HiveMetaStoreClient. There is a Kite bug, KITE-1014 ("Fix support for Hive datasets
on Kerberos enabled clusters."), in version 1.1.0. Sqoop 1.4.6 uses Kite 1.0.0, which does not
include this fix.
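
A minimal sketch of the version comparison implied here, assuming KITE-1014 is fixed in Kite 1.1.0 as stated above. The helper names are hypothetical and are not part of Sqoop or Kite:

```python
def parse_version(text):
    """Turn a dotted version such as '1.0.0' into (1, 0, 0) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


# Assumption taken from the comment above: the KITE-1014 fix is in Kite 1.1.0.
KITE_1014_FIX_VERSION = parse_version("1.1.0")


def has_kite_1014_fix(kite_version):
    """Return True if the given Kite version is at or beyond the assumed fix version."""
    return parse_version(kite_version) >= KITE_1014_FIX_VERSION


print(has_kite_1014_fix("1.0.0"))  # the Kite version Sqoop 1.4.6 uses
print(has_kite_1014_fix("1.1.0"))
```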

I made the following changes:
1) Upgraded the Kite dependency in Sqoop to the latest version
2) On the Sqoop side, added the Hive configuration and passed it to Kite

After these two fixes, the error reported above in this bug is gone, but a new problem occurred:
... ...
2016-03-24 21:21:12,647 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: hadoop login
2016-03-24 21:21:12,649 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: hadoop login commit
2016-03-24 21:21:12,650 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: using kerberos user:ambari-qa@XXX.COM
2016-03-24 21:21:12,650 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: Using user: "ambari-qa@XXX.COM" with name ambari-qa@XXX.COM
2016-03-24 21:21:12,650 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: User entry: "ambari-qa@XXX.COM"
2016-03-24 21:21:12,657 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: UGI loginUser:ambari-qa@XXX.COM (auth:KERBEROS)
2016-03-24 21:21:12,657 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-03-24 21:21:12,657 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 6 cluster_timestamp: 1458832712880 } attemptId: 1 } keyId: 1898169565)
2016-03-24 21:21:12,752 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: HDFS_DELEGATION_TOKEN, Service: 9.30.151.107:8020, Ident: (HDFS_DELEGATION_TOKEN token 38 for ambari-qa)
2016-03-24 21:21:12,753 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: TIMELINE_DELEGATION_TOKEN, Service: 9.30.151.107:8188, Ident: (owner=ambari-qa, renewer=yarn, realUser=, issueDate=1458879665842, maxDate=1459484465842, sequenceNumber=32, masterKeyId=37)
2016-03-24 21:21:12,672 DEBUG [TGT Renewer for ambari-qa@XXX.COM] org.apache.hadoop.security.UserGroupInformation: Found tgt Ticket (hex) = 
... ...
2016-03-24 21:21:13,728 DEBUG [main] org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository: Loading dataset: hhh555
2016-03-24 21:21:13,823 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://xxx:9083
2016-03-24 21:21:13,830 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:ambari-qa (auth:SIMPLE) from:org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:403)
2016-03-24 21:21:13,858 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:ambari-qa (auth:SIMPLE) from:org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
2016-03-24 21:21:13,858 DEBUG [main] org.apache.thrift.transport.TSaslTransport: opening transport org.apache.thrift.transport.TSaslClientTransport@25f7391e
2016-03-24 21:21:13,864 ERROR [main] org.apache.thrift.transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:432)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:237)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:182)
    at org.kitesdk.data.spi.hive.MetaStoreUtil.<init>(MetaStoreUtil.java:89)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.getMetaStoreUtil(HiveAbstractMetadataProvider.java:63)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:270)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:255)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:102)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:197)
    at org.kitesdk.data.Datasets.load(Datasets.java:108)
    at org.kitesdk.data.Datasets.load(Datasets.java:165)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.load(DatasetKeyOutputFormat.java:542)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getOutputCommitter(DatasetKeyOutputFormat.java:505)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:476)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1560)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:377)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1518)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    ... 34 more
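
To make the excerpt above easier to scan, the following sketch pulls out the token kinds that the application master reports and the auth methods that UserGroupInformation logs. It assumes the log format shown above; the function name and the abbreviated sample are illustrative only. In this excerpt the login user is reported with auth:KERBEROS while the metastore connection runs as auth:SIMPLE, and none of the listed token kinds mentions Hive. Whether that is the cause of the GSS failure is not established here.

```python
import re


def summarize_am_log(log_text):
    """Collect token kinds and UGI auth methods from an MRAppMaster log excerpt."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    token_kinds = re.findall(r"Kind: (\w+),", flat)
    auth_methods = re.findall(r"\(auth:(\w+)\)", flat)
    return token_kinds, auth_methods


# Abbreviated sample copied from the log excerpt above.
SAMPLE_LOG = """\
2016-03-24 21:21:12,657 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: UGI
loginUser:ambari-qa@XXX.COM (auth:KERBEROS)
2016-03-24 21:21:12,657 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind:
YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 6 cluster_timestamp:
1458832712880 } attemptId: 1 } keyId: 1898169565)
2016-03-24 21:21:12,752 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind:
HDFS_DELEGATION_TOKEN, Service: 9.30.151.107:8020, Ident: (HDFS_DELEGATION_TOKEN token 38
for ambari-qa)
2016-03-24 21:21:12,753 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind:
TIMELINE_DELEGATION_TOKEN, Service: 9.30.151.107:8188, Ident: (owner=ambari-qa, renewer=yarn,
realUser=, issueDate=1458879665842, maxDate=1459484465842, sequenceNumber=32, masterKeyId=37)
2016-03-24 21:21:13,830 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: PrivilegedAction
as:ambari-qa (auth:SIMPLE) from:org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:403)
"""

token_kinds, auth_methods = summarize_am_log(SAMPLE_LOG)
print("token kinds:", token_kinds)
print("auth methods:", auth_methods)
print("any Hive token listed:", any("HIVE" in kind for kind in token_kinds))
```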




was (Author: wpwang):
I found that Parquet support, added via SQOOP-1390, uses the Kite SDK API. According to the stack trace, org.kitesdk.data.spi.hive.MetaStoreUtil
calls HiveMetaStoreClient. There is a Kite bug, KITE-1014 ("Fix support for Hive datasets
on Kerberos enabled clusters."), in version 1.1.0. Sqoop 1.4.6 uses Kite 1.0.0, which does not
include this fix.

I made the following changes:
1) Upgraded the Kite dependency in Sqoop to the latest version
2) On the Sqoop side, added the Hive configuration and passed it to Kite

After these two fixes, the error reported above in this bug is gone, but a new problem occurred. The
Kerberos authentication 

sqoop import --connect jdbc:db2://xxx:50000/testdb --username xxx --password xxx --table users --hive-import -hive-table hhh555 --as-parquetfile -m 1 --verbose
... ...
2016-03-24 21:21:12,647 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: hadoop login
2016-03-24 21:21:12,649 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: hadoop login commit
2016-03-24 21:21:12,650 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: using kerberos user:ambari-qa@XXX.COM
2016-03-24 21:21:12,650 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: Using user: "ambari-qa@XXX.COM" with name ambari-qa@XXX.COM
2016-03-24 21:21:12,650 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: User entry: "ambari-qa@XXX.COM"
2016-03-24 21:21:12,657 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: UGI loginUser:ambari-qa@XXX.COM (auth:KERBEROS)
2016-03-24 21:21:12,657 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Executing with tokens:
2016-03-24 21:21:12,657 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: YARN_AM_RM_TOKEN, Service: , Ident: (appAttemptId { application_id { id: 6 cluster_timestamp: 1458832712880 } attemptId: 1 } keyId: 1898169565)
2016-03-24 21:21:12,752 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: HDFS_DELEGATION_TOKEN, Service: 9.30.151.107:8020, Ident: (HDFS_DELEGATION_TOKEN token 38 for ambari-qa)
2016-03-24 21:21:12,753 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: TIMELINE_DELEGATION_TOKEN, Service: 9.30.151.107:8188, Ident: (owner=ambari-qa, renewer=yarn, realUser=, issueDate=1458879665842, maxDate=1459484465842, sequenceNumber=32, masterKeyId=37)
2016-03-24 21:21:12,672 DEBUG [TGT Renewer for ambari-qa@XXX.COM] org.apache.hadoop.security.UserGroupInformation: Found tgt Ticket (hex) = 
... ...
2016-03-24 21:21:13,728 DEBUG [main] org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository: Loading dataset: hhh555
2016-03-24 21:21:13,823 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://xxx:9083
2016-03-24 21:21:13,830 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:ambari-qa (auth:SIMPLE) from:org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:403)
2016-03-24 21:21:13,858 DEBUG [main] org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:ambari-qa (auth:SIMPLE) from:org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
2016-03-24 21:21:13,858 DEBUG [main] org.apache.thrift.transport.TSaslTransport: opening transport org.apache.thrift.transport.TSaslClientTransport@25f7391e
2016-03-24 21:21:13,864 ERROR [main] org.apache.thrift.transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:432)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:237)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:182)
    at org.kitesdk.data.spi.hive.MetaStoreUtil.<init>(MetaStoreUtil.java:89)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.getMetaStoreUtil(HiveAbstractMetadataProvider.java:63)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:270)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:255)
    at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:102)
    at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:197)
    at org.kitesdk.data.Datasets.load(Datasets.java:108)
    at org.kitesdk.data.Datasets.load(Datasets.java:165)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.load(DatasetKeyOutputFormat.java:542)
    at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getOutputCommitter(DatasetKeyOutputFormat.java:505)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:476)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1560)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:458)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:377)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1518)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    ... 34 more



> Hive import with Parquet failed in Kerberos enabled cluster
> -----------------------------------------------------------
>
>                 Key: SQOOP-2894
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2894
>             Project: Sqoop
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 1.4.6
>         Environment: Redhat 6.6, Sqoop 1.4.6+Hadoop 2.7.2+Hive 1.2.1
>            Reporter: Ping Wang
>              Labels: security
>
> Importing data from an external database to Hive with the Parquet option failed in the Kerberos environment. (It succeeds without Kerberos.)
>
> The sqoop command I used:
> sqoop import --connect jdbc:db2://xxx:50000/testdb --username xxx --password xxx --table users --hive-import -hive-table users3 --as-parquetfile -m 1
> The import job failed:
>
> ......
> 2016-02-26 04:20:07,020 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
> 2016-02-26 04:20:08,088 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
> 2016-02-26 04:20:08,918 INFO [main] hive.metastore: Trying to connect to metastore with URI thrift://xxx:9083
> 2016-02-26 04:30:09,207 WARN [main] hive.metastore: set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
> org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>     at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:380)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:230)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_set_ugi(ThriftHiveMetastore.java:3688)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.set_ugi(ThriftHiveMetastore.java:3674)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:448)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:237)
>     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:182)
>     at org.kitesdk.data.spi.hive.MetaStoreUtil.<init>(MetaStoreUtil.java:82)
>     at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.getMetaStoreUtil(HiveAbstractMetadataProvider.java:63)
>     at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:270)
>     at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.resolveNamespace(HiveAbstractMetadataProvider.java:255)
>     at org.kitesdk.data.spi.hive.HiveAbstractMetadataProvider.load(HiveAbstractMetadataProvider.java:102)
>     at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:192)
>     at org.kitesdk.data.Datasets.load(Datasets.java:108)
>     at org.kitesdk.data.Datasets.load(Datasets.java:165)
>     at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.load(DatasetKeyOutputFormat.java:510)
>     at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getOutputCommitter(DatasetKeyOutputFormat.java:473)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:476)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.call(MRAppMaster.java:458)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1560)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:458)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:377)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1518)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
>     at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
> ....... 
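
The two hive.metastore timestamps quoted above are about ten minutes apart. A small sketch of that arithmetic, using only the timestamps from the log; the helper name is hypothetical. A gap of this size would be consistent with a client read timeout of roughly 600 seconds, though the configured value is not shown in this message.

```python
from datetime import datetime

# Timestamp layout used by the log lines above, e.g. "2016-02-26 04:20:08,918".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start, end):
    """Return the elapsed seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# "Trying to connect to metastore" and "set_ugi() not successful" timestamps.
gap = seconds_between("2016-02-26 04:20:08,918", "2016-02-26 04:30:09,207")
print(f"{gap:.3f} seconds between connect attempt and set_ugi() warning")
```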



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
