spark-user mailing list archives

From "ashok34668@yahoo.com.INVALID" <ashok34...@yahoo.com.INVALID>
Subject repartition in Spark
Date Mon, 09 Nov 2020 16:56:32 GMT
Hi,
Just need some advice.
   
   - When multiple Spark nodes are running the code, under what conditions does a repartition
make sense?
   - Can we repartition and cache the result, e.g. df = spark.sql("select from ...").repartition(4).cache()?
   - If we choose repartition(4), does that repartitioning apply to all nodes running the
code, and how can one see that?
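
To make the second and third points concrete, here is a minimal Scala sketch of the kind of check I have in mind. It is only an illustration under assumptions: the object name RepartitionSketch and the helper repartitionAndCache are made up, it runs against a local master rather than a real cluster, and spark.range stands in for the actual query, which is elided above. It repartitions to 4, caches, and then reads back the partition count and the row count per partition.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.spark_partition_id

object RepartitionSketch {

  // Hypothetical helper: repartition a DataFrame into n partitions and mark it for caching.
  def repartitionAndCache(df: DataFrame, n: Int): DataFrame =
    df.repartition(n).cache()

  def main(args: Array[String]): Unit = {
    // Assumption: a local master for experimentation; on a cluster the master
    // would normally come from spark-submit instead.
    val spark = SparkSession.builder()
      .appName("repartition-sketch")
      .master("local[*]")
      .getOrCreate()

    // Stand-in data; the real query from the question is not reproduced here.
    val source = spark.range(0, 1000).toDF("id")

    val df = repartitionAndCache(source, 4)

    // cache() is lazy, so run an action to actually materialize the cached data.
    df.count()

    // Number of partitions of the resulting DataFrame.
    println(s"partitions: ${df.rdd.getNumPartitions}")

    // Row count per partition id, to see how the rows were spread out.
    df.groupBy(spark_partition_id().as("partition"))
      .count()
      .orderBy("partition")
      .show()

    spark.stop()
  }
}
```

As far as I understand, the partition count belongs to the DataFrame as a whole rather than to each node, and the Storage tab of the Spark UI should list the cached partitions once the count has run, but please correct me if that is wrong.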

Thanks

