spark-user mailing list archives

From Daniel Siegmann <dsiegm...@securityscorecard.io>
Subject Re: More instances = slower Spark job
Date Thu, 28 Sep 2017 13:26:35 GMT
> no matter what you do and how many nodes you start, in case you have a
> single text file, it will not use parallelism.
>

This is not true, unless the file is small or is gzipped (gzipped files
cannot be split).
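
To make the point concrete, here is a rough sketch of how the number of input partitions for a single text file could be estimated. It is a simplified model, not Spark's or Hadoop's actual split logic: the function name estimated_partitions, the 128 MiB split size, and the ".gz" suffix check are all assumptions for illustration.

```python
import math

# Assumed split size. 128 MiB is a common HDFS block size,
# but the real value depends on the cluster configuration.
SPLIT_SIZE = 128 * 1024 * 1024


def estimated_partitions(file_size: int, path: str,
                         split_size: int = SPLIT_SIZE) -> int:
    """Estimate how many partitions one text file yields (hypothetical helper)."""
    # A gzipped file cannot be split, so a single task reads all of it.
    if path.endswith(".gz"):
        return 1
    # A small file fits in one split; a large one is cut into several.
    return max(1, math.ceil(file_size / split_size))


GIB = 1024 ** 3
MIB = 1024 ** 2

large_plain = estimated_partitions(10 * GIB, "events.txt")
large_gzip = estimated_partitions(10 * GIB, "events.txt.gz")
small_plain = estimated_partitions(1 * MIB, "events.txt")
```

On a real job, calling getNumPartitions() on the RDD returned by sc.textFile(...) should show the actual partition count.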
