spark-user mailing list archives

From Prithish <prith...@gmail.com>
Subject Re: AVRO File size when caching in-memory
Date Tue, 15 Nov 2016 05:15:10 GMT
I am using Spark 2.0.1 and the Databricks spark-avro library 3.0.1. I am
running this on the latest AWS EMR release.

On Mon, Nov 14, 2016 at 3:06 PM, Jörn Franke <jornfranke@gmail.com> wrote:

> Which Spark version? Are you using Tungsten?
>
> > On 14 Nov 2016, at 10:05, Prithish <prithish@gmail.com> wrote:
> >
> > Can someone please explain why this happens?
> >
> > When I read a 600 KB Avro file and cache it in memory (using
> > cacheTable), it shows up as 11 MB (Storage tab in the Spark UI). I have
> > tried this with different file sizes, and the in-memory size is always
> > proportional. I thought Spark compresses data when using cacheTable.
>
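
For scale, the figures in the quoted question can be turned into an expansion ratio. The sketch below is plain arithmetic, not a Spark API call. It assumes binary units (1 KB = 1024 bytes) and that the ratio stays constant across file sizes, as the question reports. The helper names and the 100 MB file are hypothetical, chosen only for illustration.

```python
# Sizes reported in the quoted message; binary units are an assumption.
KB = 1024
MB = 1024 * KB

on_disk_bytes = 600 * KB    # Avro file on disk
in_memory_bytes = 11 * MB   # size shown in the Storage tab after cacheTable


def expansion_ratio(disk_bytes, cached_bytes):
    """Return how many times larger the cached table is than the file."""
    return cached_bytes / disk_bytes


def projected_cache_bytes(disk_bytes, ratio):
    """Estimate the cached size of another file, assuming a constant ratio."""
    return disk_bytes * ratio


ratio = expansion_ratio(on_disk_bytes, in_memory_bytes)
print(f"expansion ratio: {ratio:.1f}x")

# Hypothetical 100 MB Avro file, for illustration only.
estimate = projected_cache_bytes(100 * MB, ratio)
print(f"projected cache size for a 100 MB file: {estimate / MB:.0f} MB")
```

With the reported numbers this comes to roughly a 19x expansion, which is a useful figure to have when sizing executor memory for larger inputs.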
