giraph-dev mailing list archives

From "Pavan Kumar (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-874) Specialized byte array partitions
Date Wed, 02 Apr 2014 01:04:37 GMT


Pavan Kumar commented on GIRAPH-874:

I agree that primitive collections improve performance, but why do you say that a new "vertex object"
has to be created to store in the map?
Vertex objects are already created and assigned to partitions. However, in GIRAPH-873
this argument is valid because of line 130 in i.e.,

Can you please elaborate?
Also, in the diff you can remove all the duplication by using delegation. For example, please look
at GIRAPH-840 to see how ByteCounter was split into InBoundByteCounter, OutBoundByteCounter, through
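
To make the delegation suggestion concrete, here is a rough sketch of the pattern. All class and method names below are hypothetical and chosen only for illustration; they are not the actual classes from GIRAPH-840 or from the GIRAPH-874 patch. The shared logic lives in one delegate, and each specialized class forwards to it instead of copying the code.

```java
/** Shared bookkeeping, written once. Hypothetical name, not a Giraph class. */
final class ByteTallyDelegate {
    private long bytes;
    private long requests;

    /** Records one request of the given size. */
    void record(int byteCount) {
        bytes += byteCount;
        requests++;
    }

    /** Formats the totals with a direction label supplied by the caller. */
    String summary(String direction) {
        return direction + ": " + bytes + " bytes in " + requests + " requests";
    }
}

/** Inbound specialization: only the entry point and the label differ. */
final class InboundTally {
    private final ByteTallyDelegate delegate = new ByteTallyDelegate();

    void onReceived(int byteCount) {
        delegate.record(byteCount);
    }

    String summary() {
        return delegate.summary("received");
    }
}

/** Outbound specialization: forwards to the same delegate type. */
final class OutboundTally {
    private final ByteTallyDelegate delegate = new ByteTallyDelegate();

    void onSent(int byteCount) {
        delegate.record(byteCount);
    }

    String summary() {
        return delegate.summary("sent");
    }
}
```

The same shape might apply to the specialized partitions in the diff: one class holding the common map handling, with the per-key-type classes forwarding to it.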

> Specialized byte array partitions
> ---------------------------------
>                 Key: GIRAPH-874
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>          Components: graph
>    Affects Versions: 1.1.0
>            Reporter: Craig Muchinsky
>             Fix For: 1.1.0
>         Attachments: GIRAPH-874-2.patch, GIRAPH-874.patch
> While doing some performance tuning I discovered that loading byte array partitions was
> performing slower than expected. I found that the extra time was being spent allocating a
> new vertex object for each distinct vertexId encountered (because the vertexId object is the map
> key). Similar to GIRAPH-704, the use of primitive maps can provide a significant performance
> benefit in this situation. By using a primitive map, the vertex object on the VertexIterator
> can be reused perpetually because the vertexId object isn't used as the map key.
> When processing a large graph with 4B vertices, the worker vertices requests were taking
> ~15 seconds each, but after implementing the above suggestion that number dropped to under a second.
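
For readers following the thread, the idea in the description can be sketched as below. This is an illustration under stated assumptions, not the code from the attached patches: `MutableLongId` is a hypothetical stand-in for a reusable id object, and `LongToBytesMap` is a deliberately tiny hand-written primitive map (a real implementation would more likely use a primitive-collections library). The point is that the map copies the primitive `long` into its own array, so the one id object can be overwritten for every vertex without corrupting earlier entries.

```java
/** Hypothetical reusable, mutable vertex id. */
final class MutableLongId {
    private long value;

    void set(long value) { this.value = value; }

    long get() { return value; }
}

/** Minimal open-addressing map from primitive long keys to byte[] values. */
final class LongToBytesMap {
    private long[] keys = new long[16];
    private byte[][] values = new byte[16][];
    private boolean[] used = new boolean[16];
    private int size;

    void put(long key, byte[] value) {
        // Keep the table at most half full so probing always finds a free slot.
        if ((size + 1) * 2 > keys.length) {
            resize();
        }
        int slot = findSlot(key);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = key; // the primitive is copied; no key object is retained
            size++;
        }
        values[slot] = value;
    }

    byte[] get(long key) {
        int slot = findSlot(key);
        return used[slot] ? values[slot] : null;
    }

    int size() { return size; }

    /** Linear probing: returns the slot holding the key, or the first free slot. */
    private int findSlot(long key) {
        int mask = keys.length - 1;
        long mixed = key * 0x9E3779B97F4A7C15L;
        int slot = (int) ((mixed ^ (mixed >>> 32)) & mask);
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    /** Doubles the capacity and reinserts every existing entry. */
    private void resize() {
        long[] oldKeys = keys;
        byte[][] oldValues = values;
        boolean[] oldUsed = used;
        keys = new long[oldKeys.length * 2];
        values = new byte[oldKeys.length * 2][];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }
}

final class PrimitiveKeyDemo {
    /** Loads vertexCount entries while reusing a single id object throughout. */
    static LongToBytesMap load(int vertexCount) {
        MutableLongId reusableId = new MutableLongId();
        LongToBytesMap partition = new LongToBytesMap();
        for (long id = 0; id < vertexCount; id++) {
            reusableId.set(id); // overwrite in place instead of allocating a new id
            partition.put(reusableId.get(), new byte[] {(byte) id});
        }
        return partition;
    }

    public static void main(String[] args) {
        System.out.println(load(1000).size());
    }
}
```

With an object-keyed map such as `HashMap`, the same loop would need a fresh key object per entry, because mutating a key that is already stored in the map breaks later lookups.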
