spark-issues mailing list archives

From "Patrick Wendell (JIRA)" <>
Subject [jira] [Comment Edited] (SPARK-5152) Let file take an hdfs:// path
Date Fri, 09 Jan 2015 06:20:35 GMT


Patrick Wendell edited comment on SPARK-5152 at 1/9/15 6:19 AM:

Should we be loading the metrics properties on executors in the first place? Maybe that's
the issue. I haven't looked at the code in a while but I'm not sure people use this in a way
where they expect to be able to query executors for metrics.

was (Author: pwendell):
Should we be loading the metrics properties on executors in the first place? Maybe that's
the issue. Since executors are ephemeral, you can't query them for any metrics anyway, right?

> Let file take an hdfs:// path
> ------------------------------------------------
>                 Key: SPARK-5152
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.2.0
>            Reporter: Ryan Williams
> From my reading of [the code|], the {{spark.metrics.conf}} property must be a path that is resolvable on the local filesystem of each executor.
> Running a Spark job with {{--conf spark.metrics.conf=hdfs://}} logs many errors (~1 per executor, presumably?) like:
> {code}
> 15/01/08 13:20:57 ERROR metrics.MetricsConfig: Error loading configure file
> hdfs:/ (No such file or directory)
>         at Method)
>         at<init>(
>         at<init>(
>         at org.apache.spark.metrics.MetricsConfig.initialize(MetricsConfig.scala:53)
>         at org.apache.spark.metrics.MetricsSystem.<init>(MetricsSystem.scala:92)
>         at org.apache.spark.metrics.MetricsSystem$.createMetricsSystem(MetricsSystem.scala:218)
>         at org.apache.spark.SparkEnv$.create(SparkEnv.scala:329)
>         at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:181)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:131)
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$
>         at org.apache.spark.deploy.SparkHadoopUtil$$anon$
>         at Method)
>         at
>         at
>         at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:60)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:163)
>         at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> {code}
> which seems consistent with the idea that it's looking on the local filesystem and not parsing the "scheme" portion of the URL.
> Letting all executors get their {{}} files from one location on HDFS would be an improvement, right?
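The stack trace shows the config being opened with a plain {{FileInputStream}}, which ignores any URI scheme. A minimal sketch of the proposed improvement might dispatch on the scheme before opening the stream. This is a hypothetical illustration, not Spark's actual {{MetricsConfig}} code; the {{MetricsConfLoader}} name is invented, and the HDFS branch is only indicated in a comment because it requires Hadoop's {{FileSystem}} API on the classpath:

```scala
import{File, FileInputStream, InputStream}
import java.util.Properties

// Hypothetical sketch: load a metrics config from a URI-style path,
// dispatching on the scheme instead of always assuming a local file.
object MetricsConfLoader {
  def load(path: String): Properties = {
    val props = new Properties()
    val uri =
    val in: InputStream = uri.getScheme match {
      case null | "file" =>
        // No scheme (or file://): local filesystem, as MetricsConfig does today.
        new FileInputStream(uri.getPath)
      case "hdfs" =>
        // With Hadoop on the classpath, this branch would be roughly:
        //   org.apache.hadoop.fs.FileSystem
        //     .get(uri, new org.apache.hadoop.conf.Configuration())
        //     .open(new org.apache.hadoop.fs.Path(uri))
        throw new UnsupportedOperationException(
          "hdfs:// paths need Hadoop's FileSystem API")
      case other =>
        throw new IllegalArgumentException(s"Unsupported scheme: $other")
    }
    try props.load(in) finally in.close()
    props
  }
}
```

Routing every scheme through Hadoop's {{FileSystem.get}} (which also handles {{file://}}) would be another option, at the cost of needing a Hadoop {{Configuration}} wherever the metrics system is initialized.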

This message was sent by Atlassian JIRA

