storm-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nathan Leung <ncle...@gmail.com>
Subject Re: Does Storm use Netty to access a class reference across worker JVM's?
Date Fri, 22 Apr 2016 10:59:39 GMT
To clarify my point is that config is NOT suitable for passing database
configuration. It can be used to pass database connection configuration
though (eg host/port values).
On Apr 22, 2016 6:41 AM, "Navin Ipe" <navin.ipe@searchlighthealth.com>
wrote:

> Thank you very much for your time and help, Nathan and John.
> Followed up on serialization, and it looks like everything can be
> serialized: http://stackoverflow.com/a/16851174/453673
> Will verify the database connection serialization also during
> implementation.
>
> On Thu, Apr 21, 2016 at 5:22 PM, Nathan Leung <ncleung@gmail.com> wrote:
>
>> mongoManager is serialized and sent to your spout.  If it's not something
>> that's easily serializable (e.g. a database connection) then you will need
>> to initialize it in spout prepare() instead of the constructor.
>>
>> On Thu, Apr 21, 2016 at 7:34 AM, Navin Ipe <
>> navin.ipe@searchlighthealth.com> wrote:
>>
>>> Thanks John, but that's odd...in the code I shared, there's a reference
>>> to mongoManager being used in the Spout (the MongoSpout internally stores a
>>> reference to mongoManager). If there are no object references shared
>>> between executors, then when the topology I created is submitted to Storm,
>>> would Storm serialize or clone and store an instance of mongoManager (and
>>> all initialized values inside it) inside the Spout? Storm would surely have
>>> to do *something *to ensure that references aren't cut off when workers
>>> operate in different JVM's...
>>>
>>> On Thu, Apr 21, 2016 at 4:45 PM, <hokiegeek2@gmail.com> wrote:
>>>
>>>> Netty is used for communication between workers and then the LMAX
>>>> disruptor queue is used to route messages between Netty and the individual
>>>> executors such as the MongoSpout and KafkaBolt. AFAIK, there are not direct
>>>> object references shared between executors because all executors
>>>> communicate via Netty/LMAX (shuffle/fieldsGrouping) or LMAX
>>>> (localIrShuffleGrouping).
>>>>
>>>> --John
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Apr 21, 2016, at 1:29 AM, Navin Ipe <navin.ipe@searchlighthealth.com>
>>>> wrote:
>>>>
>>>> In the below code,
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *public static void main(String[] cmdArgs) {Config config = new
>>>> Config();config.setNumWorkers(5);        MongoManager mongoManager = new
>>>> MongoManager();TopologyBuilder builder = new
>>>> TopologyBuilder();builder.setSpout("someSpout", new
>>>> MongoSpout(mongoManger)));}*
>>>>
>>>> Assuming there are many more spouts and blots created, I understand
>>>> that each worker will run in its own JVM
>>>> <http://www.michael-noll.com/blog/2013/06/21/understanding-storm-internal-message-buffers/>,
>>>> which means that it will have its own memory space.
>>>>
>>>> *Questions:*
>>>> *1.* So when the mongoManager reference is passed to MongoSpout, will
>>>> MongoSpout always be able to access the initialized members of mongoManager?
>>>> *2.* Isn't it likely that main() runs in a different JVM and a
>>>> MongoSpout will be in another JVM? How would Storm access mongoManager?
>>>> Using Netty?
>>>> *3.* (optional help) I have the Storm source code. Could anyone point
>>>> me to the part that Storm does the inter-worker communication for accessing
>>>> class references?
>>>>
>>>> --
>>>> Regards,
>>>> Navin
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Navin
>>>
>>
>>
>
>
> --
> Regards,
> Navin
>

Mime
View raw message