spark-dev mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: saveAsTable in 2.3.2 throws IOException while 2.3.1 works fine?
Date Sun, 30 Sep 2018 20:25:36 GMT
Hm, changes in the behavior of the default warehouse dir sound
familiar, but anything I could find was resolved well before 2.3.1
even. I don't know of a change here. What location are you expecting?
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315420&version=12343289
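
To compare the two releases, it may help to print what each one resolves as the warehouse location. The following is only a sketch to run inside spark-shell, where `spark` is the predefined session. It assumes the default database location recorded in the catalog is what decides where `saveAsTable` writes a managed table; nothing in this thread has confirmed that this is the cause.

```scala
// Run inside spark-shell; `spark` is the SparkSession the shell creates.

// Effective value of the warehouse setting for this session.
val warehouseDir = spark.conf.get("spark.sql.warehouse.dir")
println(s"spark.sql.warehouse.dir = $warehouseDir")

// Location the catalog has recorded for the default database.
// Assumption: managed tables created without an explicit path go under it.
val defaultDbLocation = spark.catalog.getDatabase("default").locationUri
println(s"default database location = $defaultDbLocation")
```

Running the same two statements in 2.3.1 and 2.3.2 should show whether the versions disagree on the location. Starting the shell with `--conf spark.sql.warehouse.dir=/tmp/spark-warehouse` (a hypothetical path, chosen only for illustration) is one way to check whether the reported values change.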
On Sun, Sep 30, 2018 at 1:38 PM Jacek Laskowski <jacek@japila.pl> wrote:
>
> Hi Sean,
>
> I thought so too, but the path "file:/user/hive/warehouse/" should not have been used
> in the first place, should it? I'm running it in spark-shell 2.3.2. Why would there be any
> changes between 2.3.1 and 2.3.2 that I just downloaded and one worked fine while the other
> did not? I had to downgrade to 2.3.1 because of this (and do want to figure out why 2.3.2
> behaves in a different way).
>
> Part of the stack trace is below.
>
> ➜  spark-2.3.2-bin-hadoop2.7 ./bin/spark-shell
> 2018-09-30 17:43:49 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
> Spark context Web UI available at http://192.168.0.186:4040
> Spark context available as 'sc' (master = local[*], app id = local-1538322235135).
> Spark session available as 'spark'.
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 2.3.2
>       /_/
>
> Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_171)
> Type in expressions to have them evaluated.
> Type :help for more information.
>
> scala> spark.version
> res0: String = 2.3.2
>
> scala> spark.range(1).write.saveAsTable("demo")
> 2018-09-30 17:44:27 WARN  ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
> 2018-09-30 17:44:28 ERROR FileOutputCommitter:314 - Mkdirs failed to create file:/user/hive/warehouse/demo/_temporary/0
> 2018-09-30 17:44:28 ERROR Utils:91 - Aborting task
> java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/demo/_temporary/0/_temporary/attempt_20180930174428_0000_m_000007_0 (exists=false, cwd=file:/Users/jacek/dev/apps/spark-2.3.2-bin-hadoop2.7)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:455)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
> at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:241)
> at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:342)
> at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302)
> at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37)
> at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:151)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:367)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:378)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:269)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:267)
> at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> at org.apache.spark.scheduler.Task.run(Task.scala:109)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
>
> Regards,
> Jacek Laskowski
> ----
> https://about.me/JacekLaskowski
> Mastering Spark SQL https://bit.ly/mastering-spark-sql
> Spark Structured Streaming https://bit.ly/spark-structured-streaming
> Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Sat, Sep 29, 2018 at 9:50 PM Sean Owen <srowen@gmail.com> wrote:
>>
>> Looks like a permission issue? Are you sure that isn't the difference, first?
>>
>> On Sat, Sep 29, 2018, 1:54 PM Jacek Laskowski <jacek@japila.pl> wrote:
>>>
>>> Hi,
>>>
>>> The following query fails in 2.3.2:
>>>
>>> scala> spark.range(10).write.saveAsTable("t1")
>>> ...
>>> 2018-09-29 20:48:06 ERROR FileOutputCommitter:314 - Mkdirs failed to create file:/user/hive/warehouse/bucketed/_temporary/0
>>> 2018-09-29 20:48:07 ERROR Utils:91 - Aborting task
>>> java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/bucketed/_temporary/0/_temporary/attempt_20180929204807_0000_m_000003_0 (exists=false, cwd=file:/Users/jacek/dev/apps/spark-2.3.2-bin-hadoop2.7)
>>> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:455)
>>> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
>>>
>>> While it works fine in 2.3.1.
>>>
>>> Could anybody explain the change in behaviour in 2.3.2? The commit / the JIRA
>>> issue would be even nicer. Thanks.
>>>
>>> Regards,
>>> Jacek Laskowski
