spark-user mailing list archives

From Sam <games2013....@gmail.com>
Subject Avro file question
Date Mon, 04 Nov 2019 17:03:29 GMT
Hi,

How do we choose between a single large Avro file (much larger than the HDFS
block size) and multiple smaller Avro files (each close to the HDFS block size)?

Since Avro is splittable, is there any need to break a very large Avro
file into smaller files?

I’m assuming that a single large Avro file can also be split across multiple
mappers/reducers/executors during processing.
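
To make that assumption concrete, here is a rough sketch of the arithmetic I
have in mind. It assumes one task per split-sized chunk and a 128 MB block
size; both are my assumptions, and estimated_splits is a hypothetical helper,
not a Spark or Hadoop API. Real split planning may apply extra rules that this
ignores.

```python
def estimated_splits(file_size_bytes: int, split_size_bytes: int) -> int:
    """Rough count of input splits for a splittable file.

    Assumption: one split per split_size_bytes chunk, rounded up.
    Ignores any minimum-size or slop rules a real planner may apply.
    """
    if file_size_bytes <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return (file_size_bytes + split_size_bytes - 1) // split_size_bytes


MB = 1024 * 1024
BLOCK = 128 * MB  # assumed HDFS block size (hypothetical value)

# One 10 GB file versus eighty files of one block each.
single_large = estimated_splits(10 * 1024 * MB, BLOCK)
many_small = sum(estimated_splits(BLOCK, BLOCK) for _ in range(80))

print(single_large, many_small)
```

If this model is roughly right, both layouts would yield the same number of
tasks, which is why I am unsure whether pre-splitting the file buys anything.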

Thanks.
