spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sean Owen <sro...@gmail.com>
Subject Re: Getting PySpark Partitions Locations
Date Thu, 25 Jun 2020 15:22:04 GMT
You can always list the S3 output path, of course.

On Thu, Jun 25, 2020 at 7:52 AM Tzahi File <tzahi.file@ironsrc.com> wrote:

> Hi,
>
> I'm using pyspark to write df to s3, using the following command:
> "df.write.partitionBy("day","hour","country").mode("overwrite").parquet(s3_output)".
>
> Is there any way to get the partitions created?
> e.g.
> day=2020-06-20/hour=1/country=US
> day=2020-06-20/hour=2/country=US
> ......
>
> --
> Tzahi File
> Data Engineer
> [image: ironSource] <http://www.ironsrc.com/>
>
> email tzahi.file@ironsrc.com
> mobile +972-546864835
> fax +972-77-5448273
> ironSource HQ - 121 Derech Menachem Begin st. Tel Aviv
> ironsrc.com <http://www.ironsrc.com/>
> [image: linkedin] <https://www.linkedin.com/company/ironsource>[image:
> twitter] <https://twitter.com/ironsource>[image: facebook]
> <https://www.facebook.com/ironSource>[image: googleplus]
> <https://plus.google.com/+ironsrc>
> This email (including any attachments) is for the sole use of the intended
> recipient and may contain confidential information which may be protected
> by legal privilege. If you are not the intended recipient, or the employee
> or agent responsible for delivering it to the intended recipient, you are
> hereby notified that any use, dissemination, distribution or copying of
> this communication and/or its content is strictly prohibited. If you are
> not the intended recipient, please immediately notify us by reply email or
> by telephone, delete this email and destroy any copies. Thank you.
>

Mime
View raw message