storm-user mailing list archives

From Navin Ipe <navin....@searchlighthealth.com>
Subject Re: How to you store database connections in a Spout or Bolt without serialization problems?
Date Wed, 04 May 2016 12:00:02 GMT
Hmm... OK, thanks. In this case I need to preserve state, so I can't use
transient.
Anyway, I redesigned the classes to keep the connection strings elsewhere,
and now everything is working fine.
Thanks a lot!
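
For anyone who finds this thread later, here is a minimal sketch of the pattern discussed above. It keeps the serializable state (the connection string) in the class and marks the connection itself transient, so it is only created in a prepare()-style method. To stay runnable without Storm or a database it makes several assumptions: FakeConnection is a hypothetical stand-in for java.sql.Connection, S1TableSketch is an illustrative reduction of the S1Table class from the original question, and roundTrip() only imitates the Java serialization that the stack trace shows TopologyBuilder.createTopology() performing. It is not Storm's own code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Demo driver: serializes and deserializes the helper, imitating what happens
// to a bolt and its fields when the topology is created and shipped to workers.
class TransientFieldDemo {
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        S1TableSketch table = new S1TableSketch("jdbc:mysql://localhost/test", "s1");
        table.prepare(); // a live connection does not break serialization here
        S1TableSketch copy = roundTrip(table);
        System.out.println("connected after deserialization: " + copy.isConnected());
        copy.prepare(); // what a bolt would do from its own prepare()
        System.out.println("connected after prepare(): " + copy.isConnected());
    }
}

// Hypothetical stand-in for java.sql.Connection: deliberately NOT Serializable.
class FakeConnection {
    private final String url;

    FakeConnection(String url) {
        this.url = url;
    }

    String url() {
        return url;
    }
}

// Illustrative reduction of S1Table: plain strings are serialized, the
// connection is transient and rebuilt on the machine that runs the task.
class S1TableSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String jdbcUrl;          // serializable state, survives the round trip
    private final String tableName;        // serializable state, survives the round trip
    private transient FakeConnection conn; // skipped by serialization, null afterwards

    S1TableSketch(String jdbcUrl, String tableName) {
        this.jdbcUrl = jdbcUrl;
        this.tableName = tableName;
    }

    // Call from the bolt's prepare() or the spout's open(); safe to call twice.
    void prepare() {
        if (conn == null) {
            conn = new FakeConnection(jdbcUrl);
        }
    }

    boolean isConnected() {
        return conn != null;
    }

    String tableName() {
        return tableName;
    }
}
```

The fields stay ordinary class members, so execute() or nextTuple() can use them. Only the non-serializable part is excluded from serialization and recreated per task, while the strings needed to rebuild it travel with the object.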

On Wed, May 4, 2016 at 3:59 PM, Jungtaek Lim <kabhwan@gmail.com> wrote:

> Declare them as class fields, marked transient (not mandatory), and
> initialize them in prepare() or open().
>
> Leaving them uninitialized until prepare() or open() is called causes no
> problems, because of the task lifecycle in Apache Storm.
>
> On Wednesday, May 4, 2016, Navin Ipe <navin.ipe@searchlighthealth.com>
> wrote:
>
>> Yes, I know they should be initialized in open() or prepare(). But I'm
>> referring to the declaration. If I do this:
>>
>>     @Override
>>     public void prepare(Map map, TopologyContext tc, OutputCollector oc) {
>>         private Connection connRef;
>>         private Statement stmt;
>>         private ResultSet rs;
>>     }
>>
>> Then connRef, stmt and rs won't be available to execute() or nextTuple(),
>> right? So what I'm asking is how to declare them so that the
>> serialization error is avoided.
>>
>> On Wed, May 4, 2016 at 3:34 PM, Sinnema, Remon <remon.sinnema@emc.com>
>> wrote:
>>
>>> Hi Navin,
>>>
>>> A DB connection is from one machine to another, how do you expect to
>>> share that between spouts and/or bolts that run on multiple machines? You
>>> should really set up the connection in open() or prepare(), so that it
>>> is specific to the machine that the spout or bolt runs on.
>>>
>>> Thanks,
>>> Ray
>>>
>>> *From:* Navin Ipe [mailto:navin.ipe@searchlighthealth.com]
>>> *Sent:* Wednesday, 4 May 2016 11:48
>>> *To:* user@storm.apache.org
>>> *Subject:* How to you store database connections in a Spout or Bolt
>>> without serialization problems?
>>>
>>> Hi,
>>>
>>> I know that if a MySQL database connection is instantiated in the
>>> constructor of a Spout or Bolt, it won't work. It should be instantiated in
>>> open() or prepare().
>>>
>>> The problem arises when I store this database connection as a member of
>>> a class which is itself a member of a bolt. E.g.:
>>>
>>>     public class MongoIteratorBolt extends BaseRichBolt {
>>>         private S1Table s1;
>>>     }
>>>
>>>     public class S1Table implements Serializable {
>>>         private Connection connRef;
>>>         private Statement stmt;
>>>         private ResultSet rs;
>>>
>>>         public S1Table(Connection conn, final String tableName) {
>>>             try {
>>>                 this.connRef = conn;
>>>                 this.stmt = conn.createStatement();
>>>
>>> I get an error like this:
>>>
>>> 8811 [main] ERROR o.a.s.s.o.a.z.s.NIOServerCnxnFactory - Thread Thread[main,5,main] died
>>> java.lang.IllegalStateException: Bolt 'mongoBolt' contains a non-serializable field of type com.mysql.jdbc.SingleByteCharsetConverter, which was instantiated prior to topology creation. com.mysql.jdbc.SingleByteCharsetConverter should be instantiated within the prepare method of 'mongoBolt at the earliest.
>>>     at org.apache.storm.topology.TopologyBuilder.createTopology(TopologyBuilder.java:127) ~[MyStorm.jar:?]
>>>     at com.slh.Mystorm.MyStorm.main(MyStorm.java:76) ~[MyStorm.jar:?]
>>> Caused by: java.lang.RuntimeException: java.io.NotSerializableException: com.mysql.jdbc.SingleByteCharsetConverter
>>>     at org.apache.storm.utils.Utils.javaSerialize(Utils.java:167) ~[MyStorm.jar:?]
>>>     at org.apache.storm.topology.TopologyBuilder.createTopology(TopologyBuilder.java:122) ~[MyStorm.jar:?]
>>>     ... 1 more
>>> Caused by: java.io.NotSerializableException: com.mysql.jdbc.SingleByteCharsetConverter
>>>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) ~[?:1.8.0_73]
>>>
>>> I assume it is because one of these fields can't be serialized:
>>>
>>>     private Connection connRef;
>>>     private Statement stmt;
>>>     private ResultSet rs;
>>>
>>> So if you can't declare them as class members because they can't be
>>> serialized, then how do you declare them so that the entire class has
>>> access to them and I don't have to keep creating new connections for
>>> every query?
>>>
>>> I'm quite sure that declaring and initializing them inside *prepare()*
>>> won't make them accessible to the rest of the class's methods.
>>>
>>>
>>> --
>>>
>>> Regards,
>>>
>>> Navin
>>>
>>
>>
>>
>> --
>> Regards,
>> Navin
>>
>
>
> --
> Name : Jungtaek Lim
> Blog : http://medium.com/@heartsavior
> Twitter : http://twitter.com/heartsavior
> LinkedIn : http://www.linkedin.com/in/heartsavior
>
>


-- 
Regards,
Navin
