spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Or (JIRA)" <>
Subject [jira] [Updated] (SPARK-12007) Network library's RPC layer requires a lot of copying
Date Tue, 01 Dec 2015 00:11:11 GMT


Andrew Or updated SPARK-12007:
    Assignee: Marcelo Vanzin

> Network library's RPC layer requires a lot of copying
> -----------------------------------------------------
>                 Key: SPARK-12007
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.6.0
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
> The network library's RPC layer has an external API based on byte arrays, instead of
ByteBuffer; that requires a lot of copying since the internals of the library use ByteBuffers
(or rather Netty's ByteBuf), and lots of external clients also use ByteBuffer.
> The extra copies could be avoided if the API used ByteBuffer instead.
> To show an extreme case, look at an RPC send via NettyRpcEnv:
> - message is encoded using JavaSerializer, resulting in a ByteBuffer
> - the ByteBuffer is copied into a byte array of the right size, since its internal array
may be larger than the actual data it holds
> - the network library's encoder copies the byte array into a ByteBuf
> - finally the data is written to the socket
> The intermediate 2 copies could be avoided if the API allowed the original ByteBuffer
to be sent instead.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message