spark-user mailing list archives

From Calum Leslie <calumles...@gmail.com>
Subject Re: S3 SubFolder Write Issues
Date Wed, 11 Mar 2015 07:09:02 GMT
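You want the s3n:// ("native") protocol rather than s3://. In Hadoop, s3:// is a
block filesystem layered on top of S3: it stores files as blocks in the bucket,
much as HDFS does, so the paths you specify don't show up as objects.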

More information on the Hadoop site: https://wiki.apache.org/hadoop/AmazonS3
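
As a minimal sketch of the change, the only thing that needs to differ in the output path is the URI scheme. The helper below is hypothetical (S3Paths and toNative are illustrative names, not part of Spark or Hadoop) and simply rewrites an s3:// prefix to s3n://, leaving any other path untouched:

```scala
// Hypothetical helper, not part of Spark or Hadoop: swaps the block-filesystem
// scheme (s3://) for the native one (s3n://) and leaves other paths unchanged.
object S3Paths {
  def toNative(path: String): String =
    if (path.startsWith("s3://")) "s3n://" + path.stripPrefix("s3://")
    else path
}

// Usage with the job from the question (assumes an existing RDD named `out`):
//   out.saveAsTextFile(S3Paths.toNative("s3://BUCKET/SUB_FOLDER/output"))
```

In practice it is just as easy to write the s3n:// path directly in the saveAsTextFile call; the helper only makes the difference explicit.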

Calum.

On Wed, 11 Mar 2015 04:47 cpalm3 <cpalm3@gmail.com> wrote:

> Hi All,
>
> I am hoping someone has seen this issue before with S3, as I haven't been
> able to find a solution for this problem.
>
> When I try to save an RDD as a text file to a subfolder in S3, it only ever
> writes to the top level of the bucket, and it produces block-level generated
> file names instead of the output folder I specified.
> Below is the sample code in Scala; I have also seen this behavior in the
> Java code.
>
>  val out = inputRdd.map { ir => mapFunction(ir) }.groupByKey()
>    .mapValues { x => mapValuesFunction(x) }
>    .saveAsTextFile("s3://BUCKET/SUB_FOLDER/output")
>
> Any ideas on how to get saveAsTextFile to write to an S3 subfolder?
>
> Thanks,
> Chris
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/S3-SubFolder-Write-Issues-tp21997.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
