spark-user mailing list archives

From "Md. Rezaul Karim" <>
Subject How to reduce number of tasks and partitions in Spark job?
Date Thu, 26 Jan 2017 17:13:02 GMT
Hi All,

When I run a Spark job on my local machine (8 cores, 16GB of RAM)
on 6.5GB of input data, it creates 193 parallel tasks and writes
the output into 193 partitions.
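For what it's worth, a task count in that range is consistent with the input being divided into splits of roughly 32 MiB, a common default split size for Hadoop-style input formats (this is an assumption; the actual split size depends on the input format and configuration):

```python
import math

file_size = 6_500_000_000      # ~6.5 GB of input, in decimal bytes (assumed)
split_size = 32 * 1024 * 1024  # 32 MiB: an assumed split size, not confirmed here

# One task per input split, so the split count approximates the task count.
num_splits = math.ceil(file_size / split_size)
print(num_splits)              # 194, on the order of the 193 tasks observed
```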

How can I reduce the number of tasks and, consequently, the number of
output files, say, to just one or a few?

*Md. Rezaul Karim*, BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland
