spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Unable to write data into hive table using Spark via Hive JDBC driver Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED
Date Mon, 19 Jul 2021 07:17:03 GMT
Your Driver seems to be OK.

 hive_driver: com.cloudera.hive.jdbc41.HS2Driver

However, this is the SQL error you are getting:

Caused by: com.cloudera.hiveserver2.support.exceptions.GeneralException:
[Cloudera][HiveJDBCDriver](500051) ERROR processing query/statement. Error
Code: 40000, SQL state: TStatus(statusCode:ERROR_STATUS,
infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while
compiling statement: FAILED: ParseException line 1:39 cannot recognize
input near '"first_name"' 'TEXT' ',' in column name or primary key or
foreign key:28


Are you using a reserved word for table columns? What is your DDL for this
table?
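
Looking at the generated query at the bottom of your trace (CREATE TABLE
test.test("first_name" TEXT , ...)), the problem appears to be that Spark's
generic JDBC dialect quotes identifiers with double quotes and maps string
columns to TEXT, neither of which HiveQL accepts. One possible workaround,
a minimal sketch I have not tested against your cluster, is to register a
custom JdbcDialect before the write so the generated DDL uses backticks and
STRING:

import java.sql.Types
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types.{DataType, StringType}

// Sketch of a Hive dialect: backtick-quote identifiers and emit STRING
// instead of TEXT so the CREATE TABLE Spark generates parses in HiveQL.
object HiveDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:hive2")

  override def quoteIdentifier(colName: String): String = s"`$colName`"

  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType => Some(JdbcType("STRING", Types.VARCHAR))
    case _          => None // fall back to the default type mapping
  }
}

JdbcDialects.registerDialect(HiveDialect)
// ... then run your df.write.format("jdbc")... as before

Even if that clears the ParseException, inserts over HiveServer2 go row by
row and can be slow, so test on a small table first; writing through the
metastore with saveAsTable is usually the better route when it is available.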


HTH



   view my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 19 Jul 2021 at 07:42, Badrinath Patchikolla <
pbadrinath1996@gmail.com> wrote:

> I have been trying to create the table in Hive from Spark itself.
>
> It works in local mode. What I am trying here is to create a managed
> table in Hive (another Spark cluster, basically CDH) from Spark
> standalone, using JDBC mode.
>
> When I try that, below is the error I am facing.
>
> On Thu, 15 Jul, 2021, 9:55 pm Mich Talebzadeh, <mich.talebzadeh@gmail.com>
> wrote:
>
>> Have you created that table in Hive, or are you trying to create it from
>> Spark itself?
>>
>> Your Hive is local. In that case you don't need a JDBC connection. Have
>> you tried:
>>
>> df2.write.mode("overwrite").saveAsTable("mydb.mytable")
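>>
>> For example, a minimal sketch (the application and table names are
>> illustrative, and it assumes your session can see the Hive metastore):
>>
>> import org.apache.spark.sql.SparkSession
>>
>> val spark = SparkSession.builder()
>>   .appName("HiveWriteExample")
>>   .enableHiveSupport()  // needed so saveAsTable targets Hive
>>   .getOrCreate()
>> import spark.implicits._
>>
>> val df2 = Seq(("John", "Smith")).toDF("first_name", "last_name")
>> df2.write.mode("overwrite").saveAsTable("mydb.mytable")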
>>
>> HTH
>>
>>
>>
>>
>>    view my LinkedIn profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>
>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>> any loss, damage or destruction of data or any other property which may
>> arise from relying on this email's technical content is explicitly
>> disclaimed. The author will in no case be liable for any monetary damages
>> arising from such loss, damage or destruction.
>>
>>
>>
>>
>> On Thu, 15 Jul 2021 at 12:51, Badrinath Patchikolla <
>> pbadrinath1996@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Trying to write data from Spark to Hive in JDBC mode; below is the
>>> sample code:
>>>
>>> Spark standalone, version 2.4.7:
>>>
>>> 21/07/15 08:04:07 WARN util.NativeCodeLoader: Unable to load
>>> native-hadoop library for your platform... using builtin-java classes where
>>> applicable
>>> Setting default log level to "WARN".
>>> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use
>>> setLogLevel(newLevel).
>>> Spark context Web UI available at http://localhost:4040
>>> Spark context available as 'sc' (master = spark://localhost:7077, app id
>>> = app-20210715080414-0817).
>>> Spark session available as 'spark'.
>>> Welcome to
>>>       ____              __
>>>      / __/__  ___ _____/ /__
>>>     _\ \/ _ \/ _ `/ __/  '_/
>>>    /___/ .__/\_,_/_/ /_/\_\   version 2.4.7
>>>       /_/
>>>
>>> Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
>>> Type in expressions to have them evaluated.
>>> Type :help for more information.
>>>
>>> scala> :paste
>>> // Entering paste mode (ctrl-D to finish)
>>>
>>> val df = Seq(
>>>     ("John", "Smith", "London"),
>>>     ("David", "Jones", "India"),
>>>     ("Michael", "Johnson", "Indonesia"),
>>>     ("Chris", "Lee", "Brazil"),
>>>     ("Mike", "Brown", "Russia")
>>>   ).toDF("first_name", "last_name", "country")
>>>
>>>
>>>  df.write
>>>   .format("jdbc")
>>>   .option("url",
>>> "jdbc:hive2://localhost:10000/foundation;AuthMech=2;UseNativeQuery=0")
>>>   .option("dbtable", "test.test")
>>>   .option("user", "admin")
>>>   .option("password", "admin")
>>>   .option("driver", "com.cloudera.hive.jdbc41.HS2Driver")
>>>   .mode("overwrite")
>>>   .save
>>>
>>>
>>> // Exiting paste mode, now interpreting.
>>>
>>> java.sql.SQLException: [Cloudera][HiveJDBCDriver](500051) ERROR
>>> processing query/statement. Error Code: 40000, SQL state:
>>> TStatus(statusCode:ERROR_STATUS,
>>> infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while
>>> compiling statement: FAILED: ParseException line 1:39 cannot recognize
>>> input near '"first_name"' 'TEXT' ',' in column name or primary key or
>>> foreign key:28:27,
>>> org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:329,
>>> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:207,
>>> org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290,
>>> org.apache.hive.service.cli.operation.Operation:run:Operation.java:260,
>>> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:504,
>>> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementAsync:HiveSessionImpl.java:490,
>>> sun.reflect.GeneratedMethodAccessor13:invoke::-1,
>>> sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
>>> java.lang.reflect.Method:invoke:Method.java:498,
>>> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
>>> org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
>>> org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
>>> java.security.AccessController:doPrivileged:AccessController.java:-2,
>>> javax.security.auth.Subject:doAs:Subject.java:422,
>>> org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1875,
>>> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
>>> com.sun.proxy.$Proxy35:executeStatementAsync::-1,
>>> org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:295,
>>> org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:507,
>>> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437,
>>> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422,
>>> org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
>>> org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
>>> org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
>>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286,
>>> java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149,
>>> java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624,
>>> java.lang.Thread:run:Thread.java:748,
>>> *org.apache.hadoop.hive.ql.parse.ParseException:line 1:39 cannot recognize
>>> input near '"first_name"' 'TEXT' ',' in column name or primary key or
>>> foreign key:33:6,
>>> org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:221,
>>> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:75,
>>> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:68,
>>> org.apache.hadoop.hive.ql.Driver:compile:Driver.java:564,
>>> org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1425,
>>> org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1398,
>>> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:205],
>>> sqlState:42000, errorCode:40000, errorMessage:Error while compiling
>>> statement: FAILED: ParseException line 1:39 cannot recognize input near
>>> '"first_name"' 'TEXT' ',' in column name or primary key or foreign key),
>>> Query: CREATE TABLE test.test("first_name" TEXT , "last_name" TEXT ,
>>> "country" TEXT ).
>>>   at
>>> com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatementInternal(Unknown
>>> Source)
>>>   at
>>> com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatement(Unknown
>>> Source)
>>>   at
>>> com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeHelper(Unknown
>>> Source)
>>>   at
>>> com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.execute(Unknown
>>> Source)
>>>   at
>>> com.cloudera.hiveserver2.jdbc.common.SStatement.executeNoParams(Unknown
>>> Source)
>>>   at
>>> com.cloudera.hiveserver2.jdbc.common.SStatement.executeAnyUpdate(Unknown
>>> Source)
>>>   at
>>> com.cloudera.hiveserver2.jdbc.common.SStatement.executeUpdate(Unknown
>>> Source)
>>>   at
>>> org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:863)
>>>   at
>>> org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:81)
>>>   at
>>> org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
>>>   at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>>>   at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>>>   at
>>> org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
>>>   at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
>>>   at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
>>>   at
>>> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
>>>   at
>>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>>   at
>>> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
>>>   at
>>> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
>>>   at
>>> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
>>>   at
>>> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
>>>   at
>>> org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
>>>   at
>>> org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
>>>   at
>>> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
>>>   at
>>> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
>>>   at
>>> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
>>>   at
>>> org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
>>>   at
>>> org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:305)
>>>   at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:291)
>>>   ... 48 elided
>>> Caused by: com.cloudera.hiveserver2.support.exceptions.GeneralException:
>>> [Cloudera][HiveJDBCDriver](500051) ERROR processing query/statement. Error
>>> Code: 40000, SQL state: TStatus(statusCode:ERROR_STATUS,
>>> infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while
>>> compiling statement: FAILED: ParseException line 1:39 cannot recognize
>>> input near '"first_name"' 'TEXT' ',' in column name or primary key or
>>> foreign key:28:27,
>>> org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:329,
>>> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:207,
>>> org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290,
>>> org.apache.hive.service.cli.operation.Operation:run:Operation.java:260,
>>> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:504,
>>> org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementAsync:HiveSessionImpl.java:490,
>>> sun.reflect.GeneratedMethodAccessor13:invoke::-1,
>>> sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
>>> java.lang.reflect.Method:invoke:Method.java:498,
>>> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
>>> org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
>>> org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
>>> java.security.AccessController:doPrivileged:AccessController.java:-2,
>>> javax.security.auth.Subject:doAs:Subject.java:422,
>>> org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1875,
>>> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
>>> com.sun.proxy.$Proxy35:executeStatementAsync::-1,
>>> org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:295,
>>> org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:507,
>>> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437,
>>> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422,
>>> org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
>>> org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
>>> org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
>>> org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286,
>>> java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149,
>>> java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624,
>>> java.lang.Thread:run:Thread.java:748,
>>> *org.apache.hadoop.hive.ql.parse.ParseException:line 1:39 cannot recognize
>>> input near '"first_name"' 'TEXT' ',' in column name or primary key or
>>> foreign key:33:6,
>>> org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:221,
>>> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:75,
>>> org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:68,
>>> org.apache.hadoop.hive.ql.Driver:compile:Driver.java:564,
>>> org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1425,
>>> org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1398,
>>> org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:205],
>>> sqlState:42000, errorCode:40000, errorMessage:Error while compiling
>>> statement: FAILED: ParseException line 1:39 cannot recognize input near
>>> '"first_name"' 'TEXT' ',' in column name or primary key or foreign key),
>>> Query: CREATE TABLE profile_test.person_test ("first_name" TEXT ,
>>> "last_name" TEXT , "country" TEXT ).
>>>   ... 77 more
>>>
>>>
>>> Found a similar issue in Jira:
>>>
>>> https://issues.apache.org/jira/browse/SPARK-31614
>>>
>>> There are no comments on it and the resolution is "Incomplete". Is there
>>> any way in Spark to write data into Hive in JDBC mode?
>>>
>>> Thanks for any help.
>>>
>>>
>>> Thanks,
>>> Badrinath.
>>>
>>
