incubator-hcatalog-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Manish Malhotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HCATALOG-541) The meta store client throws TimeOut exception if ~1000 clients are trying to call listPartition on the server
Date Wed, 21 Jan 2015 17:27:34 GMT

    [ https://issues.apache.org/jira/browse/HCATALOG-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285924#comment-14285924
] 

Manish Malhotra commented on HCATALOG-541:
------------------------------------------

Hi Travis and Arup,

I'm also facing similar problem while using Hive Thrift Server but without HCatalog. 
But I didnt see OOM error in the thrift server logs. 

Pattern is mostly when the load on the Hive thrift server is high ( mostly when most of the
Hive ETL jobs are running) some time it start getting into the mode where it doesnt respond
in time and throws Socket Timeout. 

And this happens for different operations and not only for list partitions.

Please update, if there is any update on this ticket, that might help my situation as well.


Regards,
Manish

Stack Trace: 

 at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
    at $Proxy7.getDatabase(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1110)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    ... 34 more
2015-01-20 22:44:12,978 ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error
in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException:
Read timed out
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException:
java.net.SocketTimeoutException: Read timed out
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1114)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)


> The meta store client throws TimeOut exception if ~1000 clients are trying to call listPartition
on the server
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HCATALOG-541
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-541
>             Project: HCatalog
>          Issue Type: Improvement
>         Environment: Hadoop 0.23.4
> Hcatalog 0.4
> Oracle
>            Reporter: Arup Malakar
>
> Error on the client:
> {code}
> 2012-10-24 21:44:03,942 INFO [pool-12-thread-2] org.apache.hcatalog.hcatmix.load.tasks.Task:
Error listing partitions
> org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read
timed out
>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:345)
>         at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:422)
>         at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:404)
>         at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
>         at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>         at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>         at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions(ThriftHiveMetastore.java:1208)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions(ThriftHiveMetastore.java:1193)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitions(HiveMetaStoreClient.java:631)
>         at org.apache.hcatalog.hcatmix.load.tasks.HCatListPartitionTask.doTask(HCatListPartitionTask.java:45)
>         at org.apache.hcatalog.hcatmix.load.TaskExecutor.call(TaskExecutor.java:79)
>         at org.apache.hcatalog.hcatmix.load.TaskExecutor.call(TaskExecutor.java:39)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> {code}
> Error on the server:
> {code}
> Exception in thread "pool-1-thread-3206" java.lang.OutOfMemoryError: unable to create
new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:597)
>         at org.datanucleus.store.query.Query.performExecuteTask(Query.java:1891)
>         at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:613)
>         at org.datanucleus.store.query.Query.executeQuery(Query.java:1692)
>         at org.datanucleus.store.query.Query.executeWithArray(Query.java:1527)
>         at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:266)
>         at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitions(ObjectStore.java:1521)
>         at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1268)
>         at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
>         at $Proxy7.getPartitions(Unknown Source)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:1468)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:5318)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:5306)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:555)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:552)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:552)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody0(TThreadPoolServer.java:176)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody1$advice(TThreadPoolServer.java:101)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:1)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> {code}
> The graph for concurrent usage of list partition can be seen here:
> https://cwiki.apache.org/confluence/download/attachments/30740331/hcatmix_list_partition_loadtest_25min.html
> The table has 2000 partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message