spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-17676) FsHistoryProvider should ignore hidden files
Date Tue, 27 Sep 2016 03:10:20 GMT

     [ https://issues.apache.org/jira/browse/SPARK-17676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-17676:
------------------------------------

    Assignee: Imran Rashid  (was: Apache Spark)

> FsHistoryProvider should ignore hidden files
> --------------------------------------------
>
>                 Key: SPARK-17676
>                 URL: https://issues.apache.org/jira/browse/SPARK-17676
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Imran Rashid
>            Assignee: Imran Rashid
>            Priority: Minor
>
> FsHistoryProvider currently reads hidden files (names beginning with ".") from the log dir.
> However, it also writes a hidden file *itself* to that dir, which cannot be parsed, as part
> of a trick to find the scan time according to the file system:
> {code}
>     val fileName = "." + UUID.randomUUID().toString
>     val path = new Path(logDir, fileName)
>     val fos = fs.create(path)
> {code}
> It does delete the temp file immediately, but we've seen cases where that race ends badly
> and an error is logged. The error is harmless (the log file is ignored and Spark moves
> on to the other log files), but the logged error is very confusing for users, so we should
> avoid it.
> {noformat}
> 2016-09-26 09:10:03,016 ERROR org.apache.spark.deploy.history.FsHistoryProvider: Exception encountered when attempting to load application log hdfs://XXX/user/spark/applicationHistory/.3a5e987c-ace5-4568-9ccd-6285010e399a
> java.lang.IllegalArgumentException: Codec [3a5e987c-ace5-4568-9ccd-6285010e399a] is not available. Consider setting spark.io.compression.codec=lzf
> at org.apache.spark.io.CompressionCodec$$anonfun$createCodec$1.apply(CompressionCodec.scala:72)
> at org.apache.spark.io.CompressionCodec$$anonfun$createCodec$1.apply(CompressionCodec.scala:72)
> at scala.Option.getOrElse(Option.scala:120)
> at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:72)
> at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8$$anonfun$apply$1.apply(EventLoggingListener.scala:309)
> at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8$$anonfun$apply$1.apply(EventLoggingListener.scala:309)
> at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:189)
> at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:91)
> at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8.apply(EventLoggingListener.scala:309)
> at org.apache.spark.scheduler.EventLoggingListener$$anonfun$8.apply(EventLoggingListener.scala:308)
> at scala.Option.map(Option.scala:145)
> at org.apache.spark.scheduler.EventLoggingListener$.openEventLog(EventLoggingListener.scala:308)
> at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$replay(FsHistoryProvider.scala:518)
> at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:359)
> at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:356)
> at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
> at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
> at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
> at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$mergeApplicationListing(FsHistoryProvider.scala:356)
> at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$1$$anon$4.run(FsHistoryProvider.scala:277)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
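
The codec name in the error above matches the temp file's name minus its leading ".", which suggests the text after the last "." is being treated as a compression codec. The following is a minimal sketch of that behaviour and of the proposed remedy (skipping hidden names before replay). It is not Spark's actual code: the object name, the helpers isHidden, codecNameOf and visibleLogs, and the file name "app-1.lz4" are all hypothetical, and plain strings stand in for Hadoop paths so the example runs on its own.

```scala
object HiddenLogFilterSketch {
  // Hypothetical helper: a name is "hidden" if it starts with ".".
  def isHidden(fileName: String): Boolean = fileName.startsWith(".")

  // Assumed model of the lookup seen in the trace: the text after the
  // last "." in the file name is taken as the codec's short name.
  def codecNameOf(fileName: String): Option[String] =
    fileName.split("\\.").tail.lastOption

  // Keep only the entries a log-dir scan should try to replay.
  def visibleLogs(fileNames: Seq[String]): Seq[String] =
    fileNames.filterNot(isHidden)

  def main(args: Array[String]): Unit = {
    val listing = Seq(
      "app-1.lz4",                             // hypothetical event log name
      ".3a5e987c-ace5-4568-9ccd-6285010e399a"  // temp file name from the error above
    )
    // Show which codec name each entry would yield under this model.
    listing.foreach(n => println(s"$n -> ${codecNameOf(n)}"))
    // Show what remains once hidden names are skipped.
    println(visibleLogs(listing))
  }
}
```

Under this model the temp file yields its own UUID as the codec name, while filtering on the leading "." drops it from the listing before any codec lookup is attempted.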



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

