spark-issues mailing list archives

From "yucai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-12196) Store/retrieve blocks in different speed storage devices by hierarchy way
Date Fri, 01 Jan 2016 02:19:39 GMT

     [ https://issues.apache.org/jira/browse/SPARK-12196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yucai updated SPARK-12196:
--------------------------
    Description: 
*Motivation*
Nowadays, customers have both SSDs (SATA SSD/PCIe SSD) and HDDs.
SSDs have great performance, but their capacity is small.
HDDs have good capacity, but are much slower than SSDs (2x-3x slower than SATA SSD, 20x slower
than PCIe SSD).
How can we get the best of both?

*Proposal*
One solution is to build a hierarchy store: use SSDs as a cache and HDDs as backup storage.
When Spark core allocates blocks (either for shuffle or RDD cache), it gets blocks from SSDs
first, and when the SSDs' usable space falls below some threshold, it gets blocks from HDDs.

In our implementation, we actually go further. We support building a hierarchy store with any
number of levels across various storage media (MEM, NVM, SSD, HDD, etc.).

*Performance*
1. In the best case, our solution performs the same as all SSDs.
2. In the worst case, e.g. when all data are spilled to HDDs, there is no performance regression.
3. Compared with all HDDs, the hierarchy store improves performance by more than *_x1.86_* (it could be
higher; the CPU becomes the bottleneck in our test environment).
4. Compared with Tachyon, our hierarchy store is still *_x1.3_* faster, because we support both
RDD cache and shuffle and need no extra inter-process communication.

*Test Environment*
1. 4 IVB boxes (40 cores, 192GB memory, 10GB NIC, 11 HDDs/11 SATA SSDs/PCIe SSD)
2. Real customer case NWeight (graph analysis), which computes associations between two
vertices that are n hops away (e.g., friend-to-friend or video-to-video relationships for recommendation).
3. Data Size: 22GB, Vertices: 41 million, Edges: 1.4 billion.

*Usage*
1. Set the priority and threshold for each layer in spark.storage.hierarchyStore.
{code}
spark.storage.hierarchyStore='nvm 40GB,ssd 20GB'
{code}
It builds a 3-layer hierarchy store: the 1st is "nvm", the 2nd is "ssd", and all the rest form
the last layer.
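
As a rough illustration of how a value such as 'nvm 40GB,ssd 20GB' could be interpreted, here is a minimal Python sketch. The function name parse_hierarchy_store and the returned list of tuples are hypothetical and not taken from the actual patch; only the comma-separated "keyword size" format comes from the description above. It also assumes that thresholds are always written in GB and that 1GB means 1024^3 bytes.

```python
import re

GB = 1024 ** 3  # assumption: "GB" is interpreted as 2^30 bytes


def parse_hierarchy_store(value):
    """Parse 'nvm 40GB,ssd 20GB' into [(keyword, threshold_bytes), ...].

    Hypothetical helper for illustration only. The order of the entries
    is kept, since it defines the priority of the layers.
    """
    layers = []
    for entry in value.split(","):
        match = re.fullmatch(r"\s*(\w+)\s+(\d+)GB\s*", entry)
        if match is None:
            raise ValueError(f"malformed layer entry: {entry!r}")
        keyword = match.group(1)
        threshold_bytes = int(match.group(2)) * GB
        layers.append((keyword, threshold_bytes))
    return layers


layers = parse_hierarchy_store("nvm 40GB,ssd 20GB")
```

The two named layers come back in priority order; the implicit last layer ("all the rest") has no entry in the setting and therefore no threshold.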

2. Configure each layer's location: the user just needs to put the keywords specified in step 1,
such as "nvm" and "ssd", into the local dirs, e.g. spark.local.dir or yarn.nodemanager.local-dirs.
{code}
spark.local.dir=/mnt/nvm1,/mnt/ssd1,/mnt/ssd2,/mnt/ssd3,/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4,/mnt/others
{code}
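
One way to picture how the keywords map directories to layers is the Python sketch below. It assumes a directory belongs to a layer when the layer's keyword occurs as a substring of its path, and that every unmatched directory falls into the last layer; the description does not spell out the exact matching rule, and the function name group_dirs_by_layer is hypothetical.

```python
def group_dirs_by_layer(local_dirs, keywords):
    """Group a comma-separated directory list into layers.

    Assumption: a directory joins the first layer whose keyword appears
    in its path; directories matching no keyword form the last layer.
    """
    matched = {keyword: [] for keyword in keywords}
    rest = []
    for directory in local_dirs.split(","):
        for keyword in keywords:
            if keyword in directory:
                matched[keyword].append(directory)
                break
        else:
            # No keyword matched this path.
            rest.append(directory)
    return [matched[keyword] for keyword in keywords] + [rest]


local_dirs = (
    "/mnt/nvm1,/mnt/ssd1,/mnt/ssd2,/mnt/ssd3,"
    "/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4,/mnt/others"
)
grouped = group_dirs_by_layer(local_dirs, ["nvm", "ssd"])
```

With the example setting above, this puts /mnt/nvm1 in the first layer, the three /mnt/ssd* directories in the second, and the remaining five directories in the last layer.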

After that, restart your Spark application; it will allocate blocks from nvm first.
When nvm's usable space is less than 40GB, it starts to allocate from ssd.
When ssd's usable space is less than 20GB, it starts to allocate from the last layer.
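
The fallback rule described above can be sketched in Python as follows. This is not the code from the patch: the function pick_layer, the (dirs, threshold) tuples and the choice to sum usable space over all directories of a layer are assumptions made for illustration. The free-space lookup is injectable so the example can run without the real mount points; by default it uses shutil.disk_usage.

```python
import shutil

GB = 1024 ** 3  # assumption: "GB" is interpreted as 2^30 bytes


def pick_layer(layers, free_bytes=lambda d: shutil.disk_usage(d).free):
    """Return the directories of the layer to allocate the next block from.

    layers is a list of (dirs, threshold_bytes) in priority order; the
    final entry is the catch-all last layer and its threshold is ignored.
    A layer is used while its usable space is at least its threshold.
    """
    for dirs, threshold in layers[:-1]:
        usable = sum(free_bytes(d) for d in dirs)
        if usable >= threshold:
            return dirs
    return layers[-1][0]


layers = [
    (["/mnt/nvm1"], 40 * GB),
    (["/mnt/ssd1", "/mnt/ssd2"], 20 * GB),
    (["/mnt/disk1"], 0),
]
# Made-up free-space figures: nvm is below 40GB, ssd is still above 20GB.
fake_free = {
    "/mnt/nvm1": 10 * GB,
    "/mnt/ssd1": 15 * GB,
    "/mnt/ssd2": 15 * GB,
    "/mnt/disk1": 500 * GB,
}
chosen = pick_layer(layers, fake_free.get)
```

In this made-up situation nvm has dropped below its 40GB threshold while ssd still has 30GB usable, so the ssd directories are chosen; once ssd also drops below 20GB, the last layer is used.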

  was:
*Motivation*
Nowadays, customers have both SSDs (SATA SSD/PCIe SSD) and HDDs.
SSDs have great performance, but their capacity is small.
HDDs have good capacity, but are much slower than SSDs (2x-3x slower than SATA SSD, 20x slower
than PCIe SSD).
How can we get the best of both?

*Proposal*
Our idea is to build a hierarchy store: use SSDs as a cache and HDDs as backup storage.
When Spark core allocates blocks (either for shuffle or RDD cache), it gets blocks from SSDs
first, and when the SSDs' usable space falls below some threshold, it gets blocks from HDDs.

In our implementation, we actually go further. We support building a hierarchy store with any
number of levels across various storage media (MEM, NVM, SSD, HDD, etc.).

*Performance*
1. In the best case, our solution performs the same as all SSDs.
2. In the worst case, e.g. when all data are spilled to HDDs, there is no performance regression.
3. Compared with all HDDs, the hierarchy store improves performance by more than *_x1.86_* (it could be
higher; the CPU becomes the bottleneck in our test environment).
4. Compared with Tachyon, our hierarchy store is still *_x1.3_* faster, because we support both
RDD cache and shuffle and need no extra inter-process communication.

*Test Environment*
1. 4 IVB boxes (40 cores, 192GB memory, 10GB NIC, 11 HDDs/11 SATA SSDs/PCIe SSD)
2. Real customer case NWeight (graph analysis), which computes associations between two
vertices that are n hops away (e.g., friend-to-friend or video-to-video relationships for recommendation).
3. Data Size: 22GB, Vertices: 41 million, Edges: 1.4 billion.

*Usage*
1. Set the priority and threshold for each layer in spark.storage.hierarchyStore.
{code}
spark.storage.hierarchyStore='nvm 40GB,ssd 20GB'
{code}
It builds a 3-layer hierarchy store: the 1st is "nvm", the 2nd is "ssd", and all the rest form
the last layer.

2. Configure each layer's location: the user just needs to put the keywords specified in step 1,
such as "nvm" and "ssd", into the local dirs, e.g. spark.local.dir or yarn.nodemanager.local-dirs.
{code}
spark.local.dir=/mnt/nvm1,/mnt/ssd1,/mnt/ssd2,/mnt/ssd3,/mnt/disk1,/mnt/disk2,/mnt/disk3,/mnt/disk4,/mnt/others
{code}

After that, restart your Spark application; it will allocate blocks from nvm first.
When nvm's usable space is less than 40GB, it starts to allocate from ssd.
When ssd's usable space is less than 20GB, it starts to allocate from the last layer.


> Store/retrieve blocks in different speed storage devices by hierarchy way
> -------------------------------------------------------------------------
>
>                 Key: SPARK-12196
>                 URL: https://issues.apache.org/jira/browse/SPARK-12196
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>            Reporter: yucai



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

