sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Szabolcs Vasas <vasas.szabo...@gmail.com>
Subject Re: Review Request 66361: Implement HiveServer2 client
Date Mon, 16 Apr 2018 09:17:26 GMT


> On April 13, 2018, 2:33 p.m., daniel voros wrote:
> > Hey Szabolcs,
> > 
> > I'm trying to run the latest patch on a (non-kerberized) cluster, but I get the
following:
> > 
> > ```
> > 18/04/13 13:50:47 INFO hive.HiveServer2ConnectionFactory: Creating connection to
HiveServer2 as: hdfs (auth:SIMPLE)
> > 18/04/13 13:50:47 INFO jdbc.Utils: Supplied authorities: hostname:10000
> > 18/04/13 13:50:47 INFO jdbc.Utils: Resolved authority: hostname:10000
> > 18/04/13 13:50:48 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException:
Error executing Hive import.
> > java.lang.RuntimeException: Error executing Hive import.
> >         at org.apache.sqoop.hive.HiveServer2Client.executeHiveImport(HiveServer2Client.java:85)
> >         at org.apache.sqoop.hive.HiveServer2Client.importTable(HiveServer2Client.java:63)
> >         at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:547)
> >         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:632)
> >         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> >         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:232)
> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:241)
> >         at org.apache.sqoop.Sqoop.main(Sqoop.java:250)
> > 
> > ...
> > 
> > 
> > Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:17 Invalid
path ''hdfs://hostname:8020/user/hdfs/asd''
> >         at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.applyConstraintsAndGetFiles(LoadSemanticAnalyzer.java:160)
> >         at org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:225)
> >         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
> >         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:465)
> >         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
> >         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1224)
> >         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1218)
> >         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> >         ... 26 more
> > Caused by: org.apache.hadoop.security.AccessControlException: Permission denied:
user=anonymous, access=EXECUTE, inode="/user/hdfs/asd":hdfs:hdfs:drwx------
> >         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
> >         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292)
> >         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238)
> >         at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
> >         at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1950)
> >         at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
> >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4142)
> > ...
> > ```
> > 
> > Note that according to the log message, we're running as 'hdfs' user (current OS
user), but HDFS checks permission for anonymous.
> > 
> > Could it be the result of org/apache/sqoop/hive/HiveServer2ConnectionFactory.java:42
(passing username=null)?
> > ```
> >   public HiveServer2ConnectionFactory(String connectionString) {
> >     this(connectionString, null, null);
> >   }
> > ```
> > 
> > Also, it might make sense to use the --hs2-user parameter in the non-kerberized
case as well. Like beeline allows you to override user with `-n username`. What do you think?
> > 
> > Regards,
> > Daniel
> 
> Szabolcs Vasas wrote:
>     Hi Dani,
>     
>     Thank you for checking this!
>     I was aiming for the kerberos authentication only so I have never tested it in non-kerberized
environment but it totally makes sense to fix this scenario. However in that case we should
add a --hs2-password parameter as well, shouldn't we? Or does Hive accept connections without
password specified?
> 
> daniel voros wrote:
>     Paswordless connections are also allowed. However, since beeline has a `-p <password>`,
I think it would make sense to add `--hs2-password`. Thank you!

I have added the fix to support password-less connections on non-secure clusters but I think
implementing a secure way to provide the password for HS2 is a bit more complex so I would
cover it in another JIRA since this one is already big enough.


- Szabolcs


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66361/#review201101
-----------------------------------------------------------


On April 16, 2018, 9:12 a.m., Szabolcs Vasas wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66361/
> -----------------------------------------------------------
> 
> (Updated April 16, 2018, 9:12 a.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3309
>     https://issues.apache.org/jira/browse/SQOOP-3309
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> This JIRA covers the implementation of the client for HiveServer2 and its integration
into the classes which use HiveImport.
> 
> - HiveClient interface is introduced with 2 implementation:
>   - HiveImport: this is the original implementation which uses HiveCLI
>   - HiveServer2Client: the new clients which connects to HS2 using JDBC connection
>   - The common code is extracted to HiveCommon class
> - HiveClient should be instantiated using HiveClientFactory which creates and configures
the right HiveClient based on the configuration in SqoopOptions
> - HiveMiniCluster is introduced with a couple of helper classes to enable end-to-end
HS2 tests
> - A couple of new options are added to SqoopOptions to be able to configure the connection
to HS2
> - Validation is implemented for these new options
> 
> 
> Diffs
> -----
> 
>   build.xml 7f68b573c65a61150ca78d158084586c87775d84 
>   ivy.xml 6be4fa20fbbf1f303c69d86942b1874e18a14afc 
>   src/docs/user/hive-args.txt 441f54e8e0cee63595937f4e1811abc2d89f9237 
>   src/docs/user/hive.txt 3dc8bb463d602d525fe5f2d07d52cb97efcbab7e 
>   src/java/org/apache/sqoop/SqoopOptions.java 651cebd69ee7e75d06c75945e3607c4fab7eb11c

>   src/java/org/apache/sqoop/hive/HiveClient.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveClientCommon.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveClientFactory.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveImport.java c2729119d31f7e585f204f2d31b2051eea71b72b

>   src/java/org/apache/sqoop/hive/HiveServer2Client.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/HiveServer2ConnectionFactory.java PRE-CREATION 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java b7a25b7809e0d50166966a77161dc8ff603fb2d2

>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java b02e4fe7fda25c7f8171c7db17d15a7987459687

>   src/java/org/apache/sqoop/tool/CreateHiveTableTool.java d259566180369a55d490144e6f865e728f4f2e61

>   src/java/org/apache/sqoop/tool/ImportAllTablesTool.java 18f7a0af48d972d5186e9414475e080f1eb765f3

>   src/java/org/apache/sqoop/tool/ImportTool.java e9920058858653bec7407bf7992eb6445401e813

>   src/test/org/apache/sqoop/hive/TestHiveClientFactory.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveMiniCluster.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveServer2Client.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestTableDefWriter.java 8bdc3beb3677312ec0ee2e612616358bca4ca838

>   src/test/org/apache/sqoop/hive/minicluster/AuthenticationConfiguration.java PRE-CREATION

>   src/test/org/apache/sqoop/hive/minicluster/HiveMiniCluster.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/minicluster/KerberosAuthenticationConfiguration.java
PRE-CREATION 
>   src/test/org/apache/sqoop/hive/minicluster/NoAuthenticationConfiguration.java PRE-CREATION

>   src/test/org/apache/sqoop/hive/minicluster/PasswordAuthenticationConfiguration.java
PRE-CREATION 
>   src/test/org/apache/sqoop/testutil/HiveServer2TestUtil.java PRE-CREATION 
>   src/test/org/apache/sqoop/tool/TestHiveServer2OptionValidations.java PRE-CREATION 
>   src/test/org/apache/sqoop/tool/TestImportTool.java 1c0cf4d863692f75bb8831e834fae47fc18b5df5

> 
> 
> Diff: https://reviews.apache.org/r/66361/diff/5/
> 
> 
> Testing
> -------
> 
> Ran unit and third party tests suite.
> 
> 
> Thanks,
> 
> Szabolcs Vasas
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message