spark-user mailing list archives

From Praveen Rachabattuni
Subject Re-distribute cache on new slave nodes for better performance
Date Mon, 10 Mar 2014 18:06:14 GMT
I have observed that a query responds faster when dataset A is cached
across 2 slave nodes rather than on 1 slave node.
I wanted to add more slave nodes and check the performance, but the new
node is only used once the data has been re-cached.

Is there any way to re-distribute the cached dataset when a new slave node
is added, ideally in less time than a full re-cache? Or please let me know
if I am misunderstanding something.

Praveen R
