spark-user mailing list archives

From "Md. Rezaul Karim" <rezaul.ka...@insight-centre.org>
Subject How to reduce number of tasks and partitions in Spark job?
Date Thu, 26 Jan 2017 17:13:02 GMT
Hi All,

When I run a Spark job on my local machine (8 cores, 16 GB of RAM)
over 6.5 GB of input data, it creates 193 parallel tasks and puts
the output into 193 partitions.

How can I change the number of tasks and, consequently, the number of
output files - say, down to just one?
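[A common answer on this list: reduce the partition count with `coalesce(n)` or `repartition(n)` before writing. A minimal sketch, assuming Spark 2.x and placeholder input/output paths:]

```scala
import org.apache.spark.sql.SparkSession

object SinglePartitionOutput {
  def main(args: Array[String]): Unit = {
    // Local session using all 8 cores; adjust master for a cluster.
    val spark = SparkSession.builder()
      .appName("SinglePartitionOutput")
      .master("local[8]")
      .getOrCreate()

    // Placeholder path for the 6.5 GB input.
    val ds = spark.read.textFile("/path/to/input")

    // coalesce(1) merges existing partitions without a full shuffle,
    // so the final stage runs as a single task and writes one file.
    ds.coalesce(1).write.text("/path/to/output")

    // repartition(1) would give the same file count but forces a full
    // shuffle; coalesce is usually cheaper when only reducing partitions.
    spark.stop()
  }
}
```

Note the trade-off: with one partition, all 6.5 GB funnels through a single task, so the write loses parallelism and may be slow. For joins and aggregations, `spark.sql.shuffle.partitions` (default 200) also controls how many tasks and output partitions the shuffle stages produce.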





Regards,
_________________________________
*Md. Rezaul Karim*, BSc, MSc
PhD Researcher, INSIGHT Centre for Data Analytics
National University of Ireland, Galway
IDA Business Park, Dangan, Galway, Ireland
Web: http://www.reza-analytics.eu/index.html
