spark-user mailing list archives

From Ramkumar Chokkalingam
Subject Controlling the name of the output file
Date Thu, 10 Oct 2013 11:18:03 GMT

I'm reading multiple files, parsing them, and writing the results to an output
file. As I understand it, saveAsTextFile takes an output path and writes the
output under that directory as files named part-00000, part-00001, etc.,
depending on the number of partitions (similar to Hadoop). Is there a way to
have the output for all input files emitted into a single output folder? Also,
do we have control over the output file names (something other than
part-00000)?
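
To make the naming question concrete, here is a rough sketch of one possible
workaround, assuming the output directory is on a local filesystem (not HDFS)
and the job has already finished. It is plain Python post-processing, not a
Spark API; the function name merge_part_files and the paths in the usage note
are hypothetical.

```python
import glob
import os
import shutil


def merge_part_files(output_dir, merged_path):
    """Concatenate part-* files from output_dir into one file at merged_path.

    Assumption: output_dir is a local directory holding files named
    part-00000, part-00001, ... Other files (e.g. _SUCCESS) are ignored.
    """
    # Sort so that part-00000, part-00001, ... are appended in order.
    part_paths = sorted(glob.glob(os.path.join(output_dir, "part-*")))
    with open(merged_path, "wb") as merged:
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                shutil.copyfileobj(part, merged)
    return part_paths
```

For example, merge_part_files("out", "results.txt") would leave the part files
in place and write their combined contents to results.txt.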
