spark-user mailing list archives

From Abel Coronado Iruegas <>
Subject SQL Filter of tweets (JSON) running on disk
Date Fri, 04 Jul 2014 14:49:08 GMT
Hi everybody

Can someone tell me whether it is possible to read and filter a 60 GB file of
tweets (JSON documents) in a standalone Spark deployment running on a single
machine with 40 GB of RAM and 8 cores?
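For reference, the constant-memory idea behind filtering a file larger than RAM can be shown with a plain-Python line-by-line filter. This is only a sketch of the principle, not Spark itself; the JSON-lines layout (one tweet per line) and the "text" field name are assumptions about the dump format:

```python
import json

def filter_tweets(lines, keyword):
    """Yield tweets whose text contains `keyword`, reading one line
    at a time so memory use stays constant regardless of file size."""
    for line in lines:
        try:
            tweet = json.loads(line)
        except ValueError:
            continue  # skip malformed lines rather than aborting
        # "text" is a hypothetical field name for the tweet body
        if keyword in tweet.get("text", ""):
            yield tweet

# Usage against a real dump would be:
#   with open("tweets.json") as f:
#       for t in filter_tweets(f, "spark"): ...
sample = [
    '{"text": "hello spark"}',
    '{"text": "unrelated"}',
    'not json',
]
hits = list(filter_tweets(sample, "spark"))
```

Because the generator never materializes more than one line at a time, the same pattern scales to a 60 GB file on a 40 GB machine.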

I mean, is it possible to configure Spark to use a fixed amount of memory
(say 20 GB) and spill the rest of the processing to disk, so as to avoid
OutOfMemory errors?
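A hedged sketch of the kind of configuration this would involve, assuming a Spark 1.x standalone deployment (property names and values below are illustrative, not tested; on top of this, the RDD would need to be persisted with `StorageLevel.MEMORY_AND_DISK` so that partitions that do not fit in RAM spill to disk instead of triggering OOM):

```
# conf/spark-defaults.conf -- illustrative, untested values
spark.executor.memory        20g
spark.storage.memoryFraction 0.4
```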


