spark-user mailing list archives

From Tathagata Das
Subject Re: spark streaming disk hit
Date Tue, 21 Jul 2015 20:54:09 GMT
Most shuffle files actually stay in the OS's buffer/disk cache, so the data
is still pretty much in memory. If you are concerned about performance, you
need to compare end-to-end performance holistically. You could take a look
at this.

On Tue, Jul 21, 2015 at 11:57 AM, Abhishek R. Singh wrote:

> Is it fair to say that Storm stream processing is completely in memory,
> whereas Spark Streaming would take a disk hit because of how shuffle works?
> Does Spark Streaming try to avoid disk usage out of the box?
> -Abhishek-