spark-user mailing list archives

From Abel Coronado Iruegas <acoronadoirue...@gmail.com>
Subject SQL filter of tweets (JSON) running on disk
Date Fri, 04 Jul 2014 14:49:08 GMT
Hi everybody

Can someone tell me whether it is possible to read and filter a 60 GB file of
tweets (JSON docs) in a standalone Spark deployment running on a single
machine with 40 GB of RAM and 8 cores?

I mean, is it possible to configure Spark to work with a fixed amount of
memory (say 20 GB), spill the rest of the processing to disk, and avoid
OutOfMemory exceptions?
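For what it's worth, here is a minimal sketch of what I have in mind, assuming the Spark 1.0 SQL API (`SQLContext.jsonFile` and `registerAsTable`) and a hypothetical path `/data/tweets.json`. The idea would be to cap the JVM heap via `spark.executor.memory` and persist the table with `MEMORY_AND_DISK` so that partitions that don't fit in memory spill to disk instead of throwing OOM:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.storage.StorageLevel

object TweetFilter {
  def main(args: Array[String]): Unit = {
    // Cap executor memory at 20 GB; Spark keeps the rest of the work on disk
    val conf = new SparkConf()
      .setMaster("local[8]")
      .setAppName("TweetFilter")
      .set("spark.executor.memory", "20g")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // jsonFile infers the schema from the JSON documents (Spark 1.0 SQL API)
    val tweets = sqlContext.jsonFile("/data/tweets.json")
    tweets.registerAsTable("tweets")

    // Persist with MEMORY_AND_DISK so partitions that exceed available
    // memory are spilled to local disk rather than causing an OutOfMemory
    tweets.persist(StorageLevel.MEMORY_AND_DISK)

    val filtered = sqlContext.sql(
      "SELECT text FROM tweets WHERE lang = 'es'")
    filtered.saveAsTextFile("/data/tweets-filtered")

    sc.stop()
  }
}
```

Would something along these lines behave as expected with a 60 GB input, or is there additional configuration (e.g. `spark.storage.memoryFraction`) that I should tune?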

Regards

Abel
