sqoop-user mailing list archives

From Mario Amatucci <mamatu...@gmail.com>
Subject Re: Sqoop with Hcat integration on AWS EMR with AWS Glue Data Catalog
Date Thu, 01 Mar 2018 07:56:51 GMT
Hi,
No idea, but as a test try adding some data to the table manually and then run a sqoop export;
if that doesn't work either, it's likely a read/write permission issue on the db or table.
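For example, something roughly like this (untested; the connection details and names are just
copied from your import command below, and you may want to point --table at a scratch table on
the MySQL side rather than the original source table):

sqoop export \
  --connect jdbc:mysql://ec2-18-221-214-250.us-east-2.compute.amazonaws.com:3306/test1 \
  --username XXX -P \
  --table sampledata1 \
  --hcatalog-database greg5 \
  --hcatalog-table sampledata1_orc1 \
  -m 1

If the export can read the HCatalog table and write rows back to MySQL, the catalog wiring is
probably fine and the problem is elsewhere.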

sent from honor8

On 22 Feb 2018 22:24, "Greg Lindholm" <greg.lindholm@gmail.com> wrote:

> Has anyone managed to get Sqoop with HCatalog integration working on AWS
> EMR when Hive is configured to use the AWS Glue Data Catalog?
>
> I'm attempting to import from a MySQL db into Hive on an AWS EMR cluster.
> Hive is configured to use AWS Glue Data Catalog as the metadata catalog.
>
> sqoop import \
>   -Dmapred.output.direct.NativeS3FileSystem=false \
>   -Dmapred.output.direct.EmrFileSystem=false \
>   --connect jdbc:mysql://ec2-18-221-214-250.us-east-2.compute.amazonaws.com:3306/test1 \
>   --username XXX -P \
>   -m 1 \
>   --table sampledata1 \
>   --hcatalog-database greg5 \
>   --hcatalog-table sampledata1_orc1 \
>   --create-hcatalog-table \
>   --hcatalog-storage-stanza 'stored as orc'
>
> It appears that the EMR setup wizard properly configures Hive to use the
> Glue Data Catalog but not Sqoop.
>
> I had to add the Glue jar to Sqoop:
> sudo ln -s /usr/share/aws/hmclient/lib/aws-glue-datacatalog-hive2-client.jar \
>     /usr/lib/sqoop/lib/aws-glue-datacatalog-hive2-client.jar
>
> When I run the above Sqoop command the table gets created, but the import
> then fails with an exception saying it can't find the table.
>
> I've checked in Glue (and Hive) and the table is created correctly.
>
> Here is the exception:
> 18/02/21 20:17:41 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.dist/hive-site.xml
> 18/02/21 20:17:42 INFO common.HiveClientCache: Initializing cache: eviction-timeout=120 initial-capacity=50 maximum-capacity=50
> 18/02/21 20:17:42 INFO hive.metastore: Trying to connect to metastore with URI thrift://ip-172-31-27-114.us-east-2.compute.internal:9083
> 18/02/21 20:17:42 INFO hive.metastore: Opened a connection to metastore, current connections: 1
> 18/02/21 20:17:42 INFO hive.metastore: Connected to metastore.
> 18/02/21 20:17:43 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: NoSuchObjectException(message:greg5.sampledata1_orc1 table not found)
>         at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:97)
>         at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:51)
>         at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:343)
>         at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:783)
>         at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
>         at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
>         at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
>         at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
>         at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
>         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
>         at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
>         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
>         at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
> Caused by: NoSuchObjectException(message:greg5.sampledata1_orc1 table not found)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:55064)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java:55032)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result.read(ThriftHiveMetastore.java:54963)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:1563)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_req(ThriftHiveMetastore.java:1550)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1344)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:169)
>         at com.sun.proxy.$Proxy5.getTable(Unknown Source)
>         at org.apache.hive.hcatalog.common.HCatUtil.getTable(HCatUtil.java:180)
>         at org.apache.hive.hcatalog.mapreduce.InitializeInput.getInputJobInfo(InitializeInput.java:105)
>         at org.apache.hive.hcatalog.mapreduce.InitializeInput.setInput(InitializeInput.java:88)
>         at org.apache.hive.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:95)
>         ... 15 more
>
> The Hive config file has this property:
> /etc/hive/conf.dist/hive-site.xml
>
> <property>
>   <name>hive.metastore.client.factory.class</name>
>   <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
> </property>
>
> Does anyone have any suggestions?
>
> /Greg
>
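One more untested thought, prompted by the hive.metastore.client.factory.class property quoted
above: Sqoop accepts generic -D options before the tool arguments, so it might be worth seeing
whether handing that factory class to the job directly makes the HCatalog client go through Glue
instead of the thrift metastore. Whether Sqoop's HCatalog code actually honours the property from
the job configuration (rather than only from hive-site.xml) is an assumption I have not verified:

sqoop import \
  -Dhive.metastore.client.factory.class=com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory \
  --connect jdbc:mysql://ec2-18-221-214-250.us-east-2.compute.amazonaws.com:3306/test1 \
  --username XXX -P \
  -m 1 \
  --table sampledata1 \
  --hcatalog-database greg5 \
  --hcatalog-table sampledata1_orc1 \
  --create-hcatalog-table \
  --hcatalog-storage-stanza 'stored as orc'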
