spark-user mailing list archives

From nit <>
Subject Re: Readin from Amazon S3 behaves inconsistently: return different number of lines...
Date Fri, 01 Aug 2014 15:09:08 GMT
@sean - I am using the latest code from the master branch, up to commit
a7d145e98c55fa66a541293930f25d9cdc25f3b4.

In my case I have multiple directories with 1024 files (the file sizes
may differ). For some directories I always get a consistent result, and
for others I can reproduce the inconsistent behavior.
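
One way to narrow this down might be to count lines per file over several reads and report only the files whose counts disagree. The following is a minimal sketch, not anything from Spark itself: `find_unstable_files` and its `count_lines` parameter are hypothetical names introduced here, and the caller has to supply the function that actually reads a file.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def find_unstable_files(
    paths: Iterable[str],
    count_lines: Callable[[str], int],
    runs: int = 3,
) -> Dict[str, List[int]]:
    """Return the files whose line count differed between repeated reads.

    `count_lines` is a caller-supplied (hypothetical) reader that returns
    the number of lines in one file; it is called `runs` times per path.
    """
    path_list = list(paths)
    observed: Dict[str, List[int]] = defaultdict(list)
    for _ in range(runs):
        for path in path_list:
            observed[path].append(count_lines(path))
    # Keep only paths for which at least two reads disagreed.
    return {
        path: counts
        for path, counts in observed.items()
        if len(set(counts)) > 1
    }
```

With PySpark, assuming an existing SparkContext named `sc`, the reader could be something like `lambda p: sc.textFile(p).count()`. Reading 1024 files one by one over several runs is slow, so this is only meant for tracking down which files are affected, not for regular use.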

I am not very familiar with the S3 protocol or the S3 driver in Spark. I
am wondering: how does the S3 driver verify that all files (and their
contents) under a directory were read correctly?
