It's probably easier if you rephrase your complexity in terms of
number of iterations and number of messages being produced per
iteration, both dependent on your N.
That should give you a better idea of feasibility with giraph.
On Thu, Mar 8, 2012 at 1:18 PM, Timmy Wilson <timmyt@smarttypes.org> wrote:
> In it's ideal form (no artificial information decay) the algo is n*n
>
> Every node communicates it's position to every other node
>
> From the node's perspective, it considers the positions of every other
> node and it's relationship to the node before making a move
>
> If n*n will scale to 5M nodes on giraph then it's simple
>
> If not, we have to condense the messaging  BarnesHut simulation
> using an Octree or Quadtree structure is one option 
> http://en.wikipedia.org/wiki/Barnes%E2%80%93Hut_simulation  this
> reduces the algo to n*log(n), but adds an additional graph to consider
>
>
>
> On Wed, Mar 7, 2012 at 11:57 AM, Claudio Martella
> <claudio.martella@gmail.com> wrote:
>> Ok, but why a bipartite graph? I can imagine you can have 2 or 3
>> coordinates associated with each vertices without actually having multiple
>> vertex types.
>>
>>
>> On Wednesday, March 7, 2012, Timmy Wilson <timmyt@smarttypes.org> wrote:
>>>> while i'd expect from linlog something like an (x,y) coordinate. is
>>>> that correct?
>>>
>>> (x,y,z) is also common  i think of it as extreme dimension reduction
>>>  after which you're free to use your favorite gis tool 
>>> http://postgis.refractions.net/  you can also pull in tools from
>>> statistical mechanics/thermodynamics
>>>
>>> Without complicating the algo for efficiency  the essence of the
>>> energy model is simple  all nodes repulse each other  connected
>>> nodes attract each other  this works for any type of graph  i'm
>>> using a weighted directed graph  in which case the weighs influence
>>> the attractive force
>>>
>>> It's all very continuous/natural  living in a 3d world we've build a
>>> lot of tools/methods to process this kind of information
>>>
>>> The problem is without efficiency methods like BarnesHut we have
>>> Cartesian(x,y) or Cartesian(x,y,z)  because every vertex influences
>>> every other vertex
>>>
>>> An efficient giraph implementation would need 2 layers  a bipartite
>>> graph  assuming (x,y,z) reduction  we would need an Octree graph
>>> interacting dynamically w/ our graph of interest
>>>
>>>
>>> On Wed, Mar 7, 2012 at 5:00 AM, Claudio Martella
>>> <claudio.martella@gmail.com> wrote:
>>>> My personal take is that they do have similar function (they "extract"
>>>> communities), but they have a general different type of output. in
>>>> label propagation you'd end up with an id for each vertex (the vertex
>>>> id that is the centroid for the community each vertex belongs to),
>>>> while i'd expect from linlog something like an (x,y) coordinate. is
>>>> that correct?
>>>>
>>>> On Wed, Mar 7, 2012 at 2:45 AM, Timmy Wilson <timmyt@smarttypes.org>
>> wrote:
>>>>> Thank you everyone!
>>>>>
>>>>> I would love to see a comparison of force directed layouts
>>>>> (specifically LinLog) and label propagation.
>>>>>
>>>>> I searched but alas nothing  they seem to be oddly similar?
>>>>>
>>>>> The current, serial LinLog implementation 
>>>>> http://code.google.com/p/linloglayout/  uses BarnesHut simulation
>>>>>  n*log(n):
>>>>>
>>>>> http://en.wikipedia.org/wiki/Barnes%E2%80%93Hut_simulation
>>>>>
>>>>> I guess the root question is  do you think it's reasonable to use
>>>>> giraph for BarnesHut simulation?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Mar 6, 2012 at 11:34 AM, Claudio Martella
>>>>> <claudio.martella@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm not definitely familiar with the algorithm or implementation
of
>>>>>> LinLog, I've been just a user. It should be doable with Giraph if
you
>>>>>> can express it in terms of messagepassing between vertices and
>>>>>> without a dependency on a global view of the graph (except for the
>>>>>> convergence criteria, such as total energy).
>>>>>>
>>>>>> Please consider that Giraph's data model is based on a directed graph,
>>>>>> this should be a quite "interesting" constraint for you, if your
>>>>>> implementation is going to modify energy associated with edges (you'd
>>>>>> have two views over the undirected edge, one in each endpoint).
>>>>>>
>>>>>> In general, a good way of doing community analysis would be to look
at
>>>>>> algorithms that belong to the family of labelpropagation clustering
>>>>>> algorithms.
>>>>>>
>>>>>>
>>>>>> Hope this helps,
>>>>>> Claudio
>>>>>>
>>>>>> On Tue, Mar 6, 2012 at 3:28 PM, Timmy Wilson <timmyt@smarttypes.org>
>> wrote:
>>>>>>> Hi giraph community,
>>>>>>>
>>>>>>> I'm interested in using giraph for distributed nbody simulation.
>>>>>>>
>>>>>>> Initially, i'm interested in force directed layouts  ie, graph
>> drawing:
>>>>>>>
>>>>>>> http://en.wikipedia.org/wiki/Forcebased_algorithms_(graph_drawing)
>>>>>>>
>>>>>>> I'm interested specifically in Dr. Andreas Noack's LinLog energy
model
>>>>>>>  which performs well w/ community detection:
>>>>>>>
>>>>>>> http://www.informatik.tucottbus.de/~an/GD/linlog.html
>>>>>>>
>>>>>>> I have a few examples of a serial implementation here:
>>>>>>>
>>>>>>> http://www.smarttypes.org/
>>>>>>>
>>>>>>> The model maximizes the distance between all nodes while minimizing
>>>>>>> the distance between connected nodes.
>>>>>>>
>>>>>>> Without getting into too much detail, i'm curious if anyone has
>>>>>>> considered using giraph for force directed graph embedding (yet
>>>>>>> another name for it)?
>>>>>>>
>>>>>>> I'm also considering something like http://www.mcs.anl.gov/petsc/
or
>>>>>>> http://www.cs.cmu.edu/~scandal/alg/nbody.html  which have fast
>>>
>>
>> 
>> Claudio Martella
>> claudio.martella@gmail.com

Claudio Martella
claudio.martella@gmail.com
