ignite-dev mailing list archives

From "mark pettovello (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-8165) Spark Dataset Write intermittent "Failed to map key to node" error
Date Fri, 06 Apr 2018 13:59:00 GMT
mark pettovello created IGNITE-8165:
---------------------------------------

             Summary: Spark Dataset Write intermittent "Failed to map key to node" error 
                 Key: IGNITE-8165
                 URL: https://issues.apache.org/jira/browse/IGNITE-8165
             Project: Ignite
          Issue Type: Bug
          Components: jdbc, spark
    Affects Versions: 2.4
         Environment: Spark 2.1.0
                      Java 1.8.0_152
                      Ignite-core-2.4.0.jar
                      ignite-spark_2.10-2.4.0.jar
                      Scala 2.11.8
            Reporter: mark pettovello


Inserts partially fail when issuing a Dataset<Row> write() operation. Rerunning the
write operation causes a different set of rows to fail each time. Not all of the rows
shown by dsCity.show() are inserted into Ignite. Every missing row encountered a
"Failed to map key to node" exception.

 

SparkSession spark = SparkSession
    .builder()
    .appName("IgniteSQLDataSource example")
    .master("local[4]") // run on a local PC using Winutils
    .config("spark.local.dir", "/tmp")
    .getOrCreate();

// ... create about 10 {(int) ID, (string) NAME} tuples and add them to the dsCity dataset ...

Dataset<Row> dsCity = spark.createDataset(...).toDF("ID","NAME");

dsCity.show(1000);

String tblName = "CITY";
String jdbcURL = "jdbc:ignite:thin://127.0.0.1/";

 

dsCity.write()
    .format("jdbc")
    .option("primary_key_fields", "ID")
    .option("url", jdbcURL)
    .option("driver", "org.apache.ignite.IgniteJdbcThinDriver")
    .option("batchsize", 1000)
    .option("dbtable", tblName)
    .mode(SaveMode.Append)
    .save();

 

18/04/06 09:33:23 ERROR Executor: Exception in task 3.0 in stage 2.0 (TID 5)
java.sql.BatchUpdateException: Failed to map key to node.
 at org.apache.ignite.internal.jdbc.thin.JdbcThinStatement.executeBatch(JdbcThinStatement.java:435)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:597)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670)
 at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:670)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925)
 at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:925)
 at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
 at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1944)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
 at org.apache.spark.scheduler.Task.run(Task.scala:99)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
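
The exception in the trace above is a standard java.sql.BatchUpdateException, which carries per-statement update counts. As a rough diagnostic sketch (not part of the original report), the helper below lists which positions in a batch cannot be confirmed as inserted. The class name BatchFailureReport and the method failedIndexes are hypothetical, and the sketch assumes the batch is issued directly through JDBC, since Spark's JdbcUtils.savePartition does not expose the exception to user code. Whether the Ignite thin driver fills in the update counts in this failure case is also an assumption.

```java
import java.sql.BatchUpdateException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class BatchFailureReport {

    /**
     * Returns the zero-based positions, within a batch of batchSize statements,
     * that the exception's update counts do not confirm as successful.
     * (Hypothetical helper for illustration only.)
     */
    static List<Integer> failedIndexes(BatchUpdateException e, int batchSize) {
        int[] counts = e.getUpdateCounts();
        if (counts == null) {
            counts = new int[0]; // driver reported no counts at all
        }
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            // A driver that stops at the first failure returns a shorter array;
            // a driver that continues marks failures with EXECUTE_FAILED.
            if (i >= counts.length || counts[i] == Statement.EXECUTE_FAILED) {
                failed.add(i);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulated exception: the second of three statements failed.
        BatchUpdateException e = new BatchUpdateException(
            "Failed to map key to node.",
            new int[] {1, Statement.EXECUTE_FAILED, 1});
        System.out.println(failedIndexes(e, 3)); // prints [1]
    }
}
```

Mapping the returned positions back to the rows added to the batch would show whether the failing keys follow a pattern or vary from run to run.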



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
