mina-dev mailing list archives

From Bernd Fondermann <bf_...@brainlounge.de>
Subject Re: [vysper] cloning and forwarding stanzas
Date Fri, 04 Sep 2009 08:41:03 GMT
Fernando Padilla wrote:
> Well, here is some sample pseudo code for creating an iterable of MUC
> stanzas..
> 
> {
> List<String> listOfJid = Lists.newArrayList( "me@example.com/A",
> "you@example.com/B" );
> Function<String,Stanza> transformer = new StanzaCloneWithNewTo(
> templateStanza );
> Iterable<Stanza> stanzas = Iterables.transform( listOfJid, transformer );
> }
> 
> So the idea is that the Relay would accept an Iterable<Stanza>, which
> it can then process one by one by simply iterating through them:
> for ( Stanza stanza : stanzas ) {
>   // relay stanza
> }
> 
> 
> Ok, now I'm going off the deep end:
> 
> This concept of being able to send a collection or iterable through the
> relay ( instead of a single message ), would be used as well when Vysper
> is clustered.. So that the ClusterRelay would batch forwarding stanzas
> between nodes.  In fact, we might even expose the
> send"StanzaCloneWithNewTo" operation in the Relay api, because the
> ClusterRelay could have a huge optimization by only sending the template
> stanza once along with the appropriate jids to their respective nodes..
> (instead of all fully constructed stanzas).
> 
> Actually, we can probably generalize that a little bit as a basic
> building block for distributing components too: run this code for
> this jid on the node the jid lives on.. So the Relay/Router would cut up
> the list of jids into sublists for each appropriate node (like
> me@example.com/A is connected to NodeA, while you@example.com/B is
> connected to NodeB), then it would forward the serializable work unit,
> along with the sublist of jids, to execute on the appropriate node:
> 
> NodeA:execute(me@example.com/A)
> NodeB:execute(you@example.com/B)
> 
> Ok, I'm getting excited, but might have started rambling.  What do you
> guys think so far?

I think that clustering is an important consideration. With thousands of
user sessions, you'd have to partition them between machines. This brings
us into a completely new ballpark of problems. (Hint: see Twitter
scaling problems). We would profit from sending a list of TOs and a
template stanza to other nodes.
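
To make the "list of TOs per node" idea concrete, here is a minimal
sketch in plain Java. Everything in it is an assumption for illustration:
NodePartitioner and the nodeLookup function are hypothetical and not part
of Vysper, and node names are plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Vysper: groups recipient JIDs by the
// cluster node their session is connected to.
final class NodePartitioner {
    private NodePartitioner() {}

    // nodeLookup is an assumed function mapping a full JID to a node name.
    static Map<String, List<String>> partitionByNode(
            Iterable<String> jids, Function<String, String> nodeLookup) {
        // LinkedHashMap keeps nodes in the order they are first seen.
        Map<String, List<String>> byNode = new LinkedHashMap<>();
        for (String jid : jids) {
            String node = nodeLookup.apply(jid);
            byNode.computeIfAbsent(node, k -> new ArrayList<>()).add(jid);
        }
        return byNode;
    }
}
```

Each node would then receive the template stanza once, together with
only its own sublist of JIDs.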

Your outline seems reasonable.

Changing the relay from taking a single stanza to taking a
StanzaGeneratorIterator would be a clean way to support a number of use
cases.
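
As a rough sketch of such a relay, assuming nothing about the real Vysper
classes: SimpleStanza, LazyStanzas and RecordingRelay below are
hypothetical stand-ins, and cloneForEach plays the role that
Iterables.transform from Google Collections would play. Stanzas are only
cloned while the relay iterates, so no full list is ever built.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a stanza; only the fields this sketch needs.
final class SimpleStanza {
    final String to;
    final String body;

    SimpleStanza(String to, String body) {
        this.to = to;
        this.body = body;
    }

    // Returns a copy of this stanza addressed to a different recipient.
    SimpleStanza withTo(String newTo) {
        return new SimpleStanza(newTo, body);
    }
}

final class LazyStanzas {
    private LazyStanzas() {}

    // Lazily maps each JID to a copy of the template stanza.
    // Nothing is cloned until the returned Iterable is iterated.
    static Iterable<SimpleStanza> cloneForEach(Iterable<String> jids, SimpleStanza template) {
        return () -> {
            Iterator<String> it = jids.iterator();
            return new Iterator<SimpleStanza>() {
                @Override public boolean hasNext() { return it.hasNext(); }
                @Override public SimpleStanza next() { return template.withTo(it.next()); }
            };
        };
    }
}

// Hypothetical relay that accepts a stream of stanzas instead of a single one.
// It only records what it would deliver, to keep the sketch self-contained.
final class RecordingRelay {
    final List<String> delivered = new ArrayList<>();

    void relay(Iterable<SimpleStanza> stanzas) {
        for (SimpleStanza stanza : stanzas) {
            delivered.add(stanza.to + ": " + stanza.body);
        }
    }
}
```

A caller would pass LazyStanzas.cloneForEach(jids, template) straight to
relay(...), and the relay never needs to know how many recipients exist.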

Would you like to open a JIRA for this and track the status of the
discussion there so this doesn't get lost?

  Bernd

> 
> 
> On 9/2/09 10:17 AM, Bernd Fondermann wrote:
>> On Wed, Sep 2, 2009 at 18:06, Fernando Padilla<fern@alum.mit.edu>  wrote:
>>   
>>> Right.  The idea is that you should have method parameters and return
>>> types be as minimal as possible ( Iterable ), instead of imposing
>>> higher requirements ( Collection, List, Set, etc ).  This just gives
>>> you some flexibility in how to implement things, and reduces the number
>>> of data structures you have to maintain or even iterate through.  It's
>>> really neat.  But like you said, it's theoretically more efficient, and
>>> maybe a good programming practice, but I don't have any hard numbers..
>>>
>>> I just wanted to make sure to mention it early, because it might be one
>>> of those things that's easy to bake in, but a little hard to apply
>>> later ( since it changes lots of apis )..
>>>
>>> But you can look at Google Collections.  They have some nice utilities
>>> under the Iterables class, for filtering or transforming the stream of
>>> objects in flight. :)
>>>      
>> Maybe you can provide an example of what this might look like for the
>> forwarding case to many JIDs?
>> It could improve the API.
>>
>>   
>>> ps - But ultimately the memory footprint for most cases won't be an
>>> issue.  But I'm really interested in streamlining the code as much as
>>> possible because my use case is for sports fans following a live game
>>> ( play by play, chat, etc ), so there might be a large number of MUC
>>> room participants or PubSub subscribers (thousands? tens of
>>> thousands?) And the system simply can't load up full lists of room
>>> participants or subscribers to be effective..
>>>      
>> I see. This is a valid concern. You'd need a stream instead of a list.
>>
>>   
>>> So I'm thinking into the future :)
>>>
>>> Heck I'm already pondering how you can make Vysper cluster aware.  It
>>> might be stable enough to start to bake in clustering/sharding
>>> concepts.. not sure if anyone else would be interested in chatting on
>>> this. :)
>>>      
>> +1, that gets me very excited, I'd love to talk more about this.
>> (But please give new topics their own thread on the list.)
>>
>>    Bernd
>>    
> 

