spark-user mailing list archives

From Matthew Cheah <matthew.c.ch...@gmail.com>
Subject Re: "Too many open files" exception on reduceByKey
Date Tue, 11 Mar 2014 20:35:55 GMT
Thanks. Just curious: is there a default number of reducers that gets used?

-Matt Cheah


On Mon, Mar 10, 2014 at 7:22 PM, Patrick Wendell <pwendell@gmail.com> wrote:

> Hey Matt,
>
> The best way is definitely just to increase the ulimit if possible;
> this is sort of an assumption we make in Spark, that clusters will be
> able to raise it.
>
> You might be able to hack around this by decreasing the number of
> reducers but this could have some performance implications for your
> job.
>
> In general, if a node in your cluster has C assigned cores and you run
> a job with X reducers, then Spark will open C*X files in parallel and
> start writing. Shuffle consolidation will help decrease the total
> number of files created, but the number of file handles open at any
> time doesn't change, so it won't help the ulimit problem.
>
> This means you'll have to use fewer reducers (e.g. pass reduceByKey an
> explicit number of reducers) or use fewer cores on each machine.
>
> - Patrick
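
Patrick's C*X rule of thumb can be sanity-checked with a little arithmetic. A minimal sketch (the per-node core count and the reducer counts below are illustrative assumptions, not values from the thread; only the 1024 ulimit comes from the messages):

```python
# Illustrative figures: the 1024 limit is from the thread; the core count
# and reducer counts are assumptions for the sake of the example.
ULIMIT = 1024        # per-process open-file limit reported by `ulimit -n`
CORES_PER_NODE = 8   # assumed; substitute your cluster's actual value

def open_shuffle_files(cores, reducers):
    """Patrick's rule of thumb: each core writes one shuffle file per
    reducer, so a node holds roughly cores * reducers files open at once."""
    return cores * reducers

def max_reducers(cores, ulimit):
    """Largest reducer count keeping cores * reducers at or under the ulimit."""
    return ulimit // cores

print(open_shuffle_files(CORES_PER_NODE, 500))  # 4000 handles: well over 1024
print(max_reducers(CORES_PER_NODE, ULIMIT))     # 128 reducers fit under the limit
```

With these assumed numbers, a job with 500 reducers needs roughly 4000 simultaneous file handles per node, which is why the 1024 limit is hit; capping reducers at 128 (or lowering cores per node) keeps the product under the limit, at the cost of coarser parallelism.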
>
> On Mon, Mar 10, 2014 at 10:41 AM, Matthew Cheah
> <matthew.c.cheah@gmail.com> wrote:
> > Hi everyone,
> >
> > My team (cc'ed in this e-mail) and I are running a Spark reduceByKey
> > operation on a cluster of 10 slaves where I don't have the privileges to
> > set "ulimit -n" to a higher number. I'm running on a cluster where
> > "ulimit -n" returns 1024 on each machine.
> >
> > When I attempt to run this job with the data originating from a text
> > file, stored in an HDFS cluster running on the same nodes as the Spark
> > cluster, the job crashes with the message, "Too many open files".
> >
> > My question is, why are so many files being created, and is there a way
> > to configure the Spark context to avoid spawning that many files? I am
> > already setting spark.shuffle.consolidateFiles to true.
> >
> > I want to repeat - I can't change the maximum number of open file
> > descriptors on the machines. This cluster is not owned by me, and the
> > system administrator is responding quite slowly.
> >
> > Thanks,
> >
> > -Matt Cheah
>
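
For reference, the two knobs discussed in this thread can be expressed as Spark configuration properties. A hedged sketch (both property names appear in or follow from the thread; the parallelism value is illustrative, chosen to keep cores * reducers under a 1024 ulimit on an assumed 8-core node):

```
# Consolidate intermediate shuffle files. This reduces the total number of
# files created, though, as Patrick notes above, not the number of file
# handles open at any one time.
spark.shuffle.consolidateFiles  true

# Cap the default number of reduce partitions so that cores * reducers
# stays below the per-process ulimit (128 assumes 8 cores and ulimit 1024;
# substitute values for your cluster).
spark.default.parallelism       128
```

Alternatively, the partition count can be passed per-operation as the second argument to reduceByKey, as Patrick suggests, which overrides the default for just that shuffle.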
