spark-user mailing list archives

From 谭成灶 <tanx...@live.cn>
Subject insert into a partition table take a long time
Date Thu, 28 Apr 2016 04:46:44 GMT

 Hello Sir/Madam,
   I want to insert about 300 GB of data into a partitioned table using dynamic partitioning
(the destination table is stored in ORC format),
   but the "get_partition_with_auth" stage takes a long time,
   even though I have set

hive.exec.dynamic.partition=true 

hive.exec.dynamic.partition.mode="nonstrict"
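
   For illustration, a minimal sketch of the kind of dynamic-partition insert described
above, as it might be entered in bin/spark-sql. The table names (src_table, dst_orc), the
columns (id, payload) and the partition column (dt) are hypothetical placeholders, not the
actual tables involved.

```sql
-- Enable dynamic partitioning (the two settings quoted above).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical tables: dst_orc is assumed to be an ORC table partitioned by dt.
-- With dynamic partitioning, the partition column is listed without a value
-- and must come last in the SELECT list.
INSERT INTO TABLE dst_orc PARTITION (dt)
SELECT id, payload, dt
FROM src_table;
```

   Each distinct dt value in the SELECT result becomes its own partition of dst_orc, so the
number of partitions to look up or create in the metastore grows with the number of distinct
values.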
   
   The following is my environment:
   Hadoop 2.5.0 (CDH 5.2.1)
   Hive 0.13.1
   spark-1.6.1-bin-2.5.0-cdh5.2.1 (I recompiled it, but hive.version=1.2.1)
   
   I found an issue, https://issues.apache.org/jira/browse/SPARK-11785, which says:
      When deployed against remote Hive metastore, execution Hive client points to the actual
Hive metastore rather than local execution Derby metastore using Hive 1.2.1 libraries delivered
together with Spark (SPARK-11783).
    JDBC calls are not properly dispatched to metastore Hive client in Thrift server, but
handled by execution Hive. (SPARK-9686).
    When a JDBC call like getSchemas() comes, execution Hive client using a higher version
(1.2.1) is used to talk to a lower version Hive metastore (0.13.1). Because of incompatible
changes made between these two versions, the Thrift RPC call fails and exceptions are thrown.
          
   When I run bin/spark-sql, this is the log output:
   16/04/28 11:08:59 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
   16/04/28 11:08:59 INFO metastore.ObjectStore: Initialized ObjectStore
   16/04/28 11:08:59 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
   16/04/28 11:08:59 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
   16/04/28 11:08:59 INFO metastore.HiveMetaStore: Added admin role in metastore
   16/04/28 11:08:59 INFO metastore.HiveMetaStore: Added public role in metastore
   16/04/28 11:09:00 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
   16/04/28 11:09:00 INFO metastore.HiveMetaStore: 0: get_all_databases
   16/04/28 11:09:00 INFO HiveMetaStore.audit: ugi=ocdc    ip=unknown-ip-addr      cmd=get_all_databases
   16/04/28 11:09:00 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
   16/04/28 11:09:00 INFO HiveMetaStore.audit: ugi=ocdc    ip=unknown-ip-addr      cmd=get_functions: db=default pat=*
   
   
    So can you suggest a way to optimize this, or do I have to upgrade my Hadoop and Hive
versions?
    
 Thanks