I have a 10 GB memory limit on the executors and am operating on a Parquet dataset of 200 blocks with a block size of 70 MB. I keep hitting the memory limit when running 'select * from t1 order by c1 limit 1000000' (i.e., 1M rows). It works if I lower the limit to, say, 100k. What are my options for saving a large dataset without running into memory issues?

Thanks!