spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Unable to write data into hive table using Spark via Hive JDBC driver Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED
Date Thu, 15 Jul 2021 16:25:17 GMT
Have you created that table in Hive or are you trying to create it from
Spark itself.

You Hive is local. In this case you don't need a JDBC connection. Have you
tried:

df2.write.mode("overwrite").saveAsTable(mydb.mytable)

HTH




   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Thu, 15 Jul 2021 at 12:51, Badrinath Patchikolla <
pbadrinath1996@gmail.com> wrote:

> Hi,
>
> Trying to write data in spark to the hive as JDBC mode below  is the
> sample code:
>
> spark standalone 2.4.7 version
>
> 21/07/15 08:04:07 WARN util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
> setLogLevel(newLevel).
> Spark context Web UI available at http://localhost:4040
> Spark context available as 'sc' (master = spark://localhost:7077, app id =
> app-20210715080414-0817).
> Spark session available as 'spark'.
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 2.4.7
>       /_/
>
> Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
> Type in expressions to have them evaluated.
> Type :help for more information.
>
> scala> :paste
> // Entering paste mode (ctrl-D to finish)
>
> val df = Seq(
>     ("John", "Smith", "London"),
>     ("David", "Jones", "India"),
>     ("Michael", "Johnson", "Indonesia"),
>     ("Chris", "Lee", "Brazil"),
>     ("Mike", "Brown", "Russia")
>   ).toDF("first_name", "last_name", "country")
>
>
>  df.write
>   .format("jdbc")
>   .option("url",
> "jdbc:hive2://localhost:10000/foundation;AuthMech=2;UseNativeQuery=0")
>   .option("dbtable", "test.test")
>   .option("user", "admin")
>   .option("password", "admin")
>   .option("driver", "com.cloudera.hive.jdbc41.HS2Driver")
>   .mode("overwrite")
>   .save
>
>
> // Exiting paste mode, now interpreting.
>
> java.sql.SQLException: [Cloudera][HiveJDBCDriver](500051) ERROR processing
> query/statement. Error Code: 40000, SQL state:
> TStatus(statusCode:ERROR_STATUS,
> infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while
> compiling statement: FAILED: ParseException line 1:39 cannot recognize
> input near '"first_name"' 'TEXT' ',' in column name or primary key or
> foreign key:28:27,
> org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:329,
> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:207,
> org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290,
> org.apache.hive.service.cli.operation.Operation:run:Operation.java:260,
> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:504,
> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementAsync:HiveSessionImpl.java:490,
> sun.reflect.GeneratedMethodAccessor13:invoke::-1,
> sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
> java.lang.reflect.Method:invoke:Method.java:498,
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
> org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
> org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
> java.security.AccessController:doPrivileged:AccessController.java:-2,
> javax.security.auth.Subject:doAs:Subject.java:422,
> org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1875,
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
> com.sun.proxy.$Proxy35:executeStatementAsync::-1,
> org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:295,
> org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:507,
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437,
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422,
> org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
> org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
> org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286,
> java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149,
> java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624,
> java.lang.Thread:run:Thread.java:748,
> *org.apache.hadoop.hive.ql.parse.ParseException:line 1:39 cannot recognize
> input near '"first_name"' 'TEXT' ',' in column name or primary key or
> foreign key:33:6,
> org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:221,
> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:75,
> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:68,
> org.apache.hadoop.hive.ql.Driver:compile:Driver.java:564,
> org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1425,
> org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1398,
> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:205],
> sqlState:42000, errorCode:40000, errorMessage:Error while compiling
> statement: FAILED: ParseException line 1:39 cannot recognize input near
> '"first_name"' 'TEXT' ',' in column name or primary key or foreign key),
> Query: CREATE TABLE test.test("first_name" TEXT , "last_name" TEXT ,
> "country" TEXT ).
>   at
> com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatementInternal(Unknown
> Source)
>   at
> com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatement(Unknown
> Source)
>   at
> com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeHelper(Unknown
> Source)
>   at
> com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.execute(Unknown
> Source)
>   at
> com.cloudera.hiveserver2.jdbc.common.SStatement.executeNoParams(Unknown
> Source)
>   at
> com.cloudera.hiveserver2.jdbc.common.SStatement.executeAnyUpdate(Unknown
> Source)
>   at com.cloudera.hiveserver2.jdbc.common.SStatement.executeUpdate(Unknown
> Source)
>   at
> org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:863)
>   at
> org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:81)
>   at
> org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
>   at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at
> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
>   at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
>   at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
>   at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
>   at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
>   at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
>   at
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
>   at
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
>   at
> org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
>   at
> org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
>   at
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
>   at
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
>   at
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
>   at
> org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
>   at
> org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:305)
>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:291)
>   ... 48 elided
> Caused by: com.cloudera.hiveserver2.support.exceptions.GeneralException:
> [Cloudera][HiveJDBCDriver](500051) ERROR processing query/statement. Error
> Code: 40000, SQL state: TStatus(statusCode:ERROR_STATUS,
> infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while
> compiling statement: FAILED: ParseException line 1:39 cannot recognize
> input near '"first_name"' 'TEXT' ',' in column name or primary key or
> foreign key:28:27,
> org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:329,
> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:207,
> org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290,
> org.apache.hive.service.cli.operation.Operation:run:Operation.java:260,
> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:504,
> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementAsync:HiveSessionImpl.java:490,
> sun.reflect.GeneratedMethodAccessor13:invoke::-1,
> sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
> java.lang.reflect.Method:invoke:Method.java:498,
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
> org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
> org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
> java.security.AccessController:doPrivileged:AccessController.java:-2,
> javax.security.auth.Subject:doAs:Subject.java:422,
> org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1875,
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
> com.sun.proxy.$Proxy35:executeStatementAsync::-1,
> org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:295,
> org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:507,
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437,
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422,
> org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
> org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
> org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286,
> java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149,
> java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624,
> java.lang.Thread:run:Thread.java:748,
> *org.apache.hadoop.hive.ql.parse.ParseException:line 1:39 cannot recognize
> input near '"first_name"' 'TEXT' ',' in column name or primary key or
> foreign key:33:6,
> org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:221,
> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:75,
> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:68,
> org.apache.hadoop.hive.ql.Driver:compile:Driver.java:564,
> org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1425,
> org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1398,
> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:205],
> sqlState:42000, errorCode:40000, errorMessage:Error while compiling
> statement: FAILED: ParseException line 1:39 cannot recognize input near
> '"first_name"' 'TEXT' ',' in column name or primary key or foreign key),
> Query: CREATE TABLE profile_test.person_test ("first_name" TEXT ,
> "last_name" TEXT , "country" TEXT ).
>   ... 77 more
>
>
> Found similar issue ion Jira :
>
> https://issues.apache.org/jira/browse/SPARK-31614
>
> There no comments in that and the resolution is Incomplete, is there any
> way we can do in spark to write data into the hive as JDBC mode.
>
> Thanks for any help.
>
>
> Thanks,
> Badrinath.
>

Mime
View raw message