storm-user mailing list archives

From Jungtaek Lim <kabh...@gmail.com>
Subject Re: How to you store database connections in a Spout or Bolt without serialization problems?
Date Wed, 04 May 2016 10:29:29 GMT
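Declare them as class fields, optionally marked transient (this is not
mandatory), and initialize them in prepare() or open().

Leaving them uninitialized until prepare() or open() is called causes no
problem, because of the task lifecycle in Apache Storm: the spout or bolt is
serialized when the topology is created, while those fields are still null,
and prepare() or open() runs on the worker before execute() or nextTuple()
is ever called.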
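
A rough sketch of that lifecycle, using only java.io rather than the Storm
API, may make this concrete. FakeConnection, SketchBolt and
TransientFieldSketch are hypothetical stand-ins (a real bolt would extend
BaseRichBolt and hold a java.sql.Connection), and roundTrip() only
approximates what happens between topology creation and task start:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a JDBC Connection: deliberately NOT Serializable.
class FakeConnection {
    private final String url;

    FakeConnection(String url) {
        this.url = url;
    }

    String query(String sql) {
        return url + " <- " + sql;
    }
}

// Hypothetical stand-in for a bolt (assumption: not the real Storm interface).
class SketchBolt implements Serializable {
    private static final long serialVersionUID = 1L;

    // Plain serializable configuration travels with the bolt.
    private final String jdbcUrl;

    // A class field, so every method can use it; transient, so Java
    // serialization skips it even if it happens to be set.
    private transient FakeConnection conn;

    SketchBolt(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    // Plays the role of prepare(): runs after deserialization.
    void prepare() {
        conn = new FakeConnection(jdbcUrl);
    }

    // Plays the role of execute(): reuses the connection made in prepare().
    String execute(String sql) {
        return conn.query(sql);
    }

    boolean isConnected() {
        return conn != null;
    }
}

class TransientFieldSketch {
    // Serialize and deserialize the bolt with standard Java serialization.
    static SketchBolt roundTrip(SketchBolt bolt) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(bolt);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SketchBolt) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SketchBolt shipped = roundTrip(new SketchBolt("jdbc:mysql://example/db"));
        System.out.println(shipped.isConnected());       // false: nothing was shipped
        shipped.prepare();
        System.out.println(shipped.execute("SELECT 1")); // jdbc:mysql://example/db <- SELECT 1
    }
}
```

The same idea should carry over to a helper such as S1Table: keep the bolt's
reference to it as a class field, but construct the helper (and its
Connection) inside prepare() instead of before the topology is created.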

On Wednesday, May 4, 2016, Navin Ipe <navin.ipe@searchlighthealth.com>
wrote:

> Yes, I know they should be initialized in open() or prepare(). But I'm
> referring to the declaration. If I do this:
>
>     @Override
>     public void prepare(Map map, TopologyContext tc, OutputCollector oc) {
>         private Connection connRef;
>         private Statement stmt;
>         private ResultSet rs;
>     }
>
> Then connRef, stmt and rs won't be available to execute() or nextTuple(),
> right? So how to declare them to avoid the serialization error, is what I'm
> asking.
>
> On Wed, May 4, 2016 at 3:34 PM, Sinnema, Remon <remon.sinnema@emc.com> wrote:
>
>> Hi Navin,
>>
>> A DB connection runs from one machine to another, so how do you expect to
>> share it between spouts and/or bolts that run on multiple machines? You
>> should really set up the connection in open() or prepare(), so that it
>> is specific to the machine that the spout or bolt runs on.
>>
>> Thanks,
>> Ray
>>
>> *From:* Navin Ipe [mailto:navin.ipe@searchlighthealth.com]
>> *Sent:* Wednesday, 4 May 2016 11:48
>> *To:* user@storm.apache.org
>> *Subject:* How to you store database connections in a Spout or Bolt
>> without serialization problems?
>>
>> Hi,
>>
>> I know that if a MySQL database connection is instantiated in the
>> constructor of a Spout or Bolt, it won't work. It should be instantiated in
>> open() or prepare().
>>
>> The problem arises when I store this database connection as a member of
>> a class that is itself a member of a bolt. E.g.:
>>
>>     public class MongoIteratorBolt extends BaseRichBolt {
>>         private S1Table s1;
>>     }
>>
>>     public class S1Table implements Serializable {
>>         private Connection connRef;
>>         private Statement stmt;
>>         private ResultSet rs;
>>
>>         public S1Table(Connection conn, final String tableName) {
>>             try {
>>                 this.connRef = conn;
>>                 this.stmt = conn.createStatement();
>>
>> I get an error like this:
>>
>>     8811 [main] ERROR o.a.s.s.o.a.z.s.NIOServerCnxnFactory - Thread Thread[main,5,main] died
>>     java.lang.IllegalStateException: Bolt 'mongoBolt' contains a non-serializable field of type com.mysql.jdbc.SingleByteCharsetConverter, which was instantiated prior to topology creation. com.mysql.jdbc.SingleByteCharsetConverter should be instantiated within the prepare method of 'mongoBolt at the earliest.
>>         at org.apache.storm.topology.TopologyBuilder.createTopology(TopologyBuilder.java:127) ~[MyStorm.jar:?]
>>         at com.slh.Mystorm.MyStorm.main(MyStorm.java:76) ~[MyStorm.jar:?]
>>     Caused by: java.lang.RuntimeException: java.io.NotSerializableException: com.mysql.jdbc.SingleByteCharsetConverter
>>         at org.apache.storm.utils.Utils.javaSerialize(Utils.java:167) ~[MyStorm.jar:?]
>>         at org.apache.storm.topology.TopologyBuilder.createTopology(TopologyBuilder.java:122) ~[MyStorm.jar:?]
>>         ... 1 more
>>     Caused by: java.io.NotSerializableException: com.mysql.jdbc.SingleByteCharsetConverter
>>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) ~[?:1.8.0_73]
>>
>> I assume it is because one of these cannot be serialized:
>>
>>     private Connection connRef;
>>     private Statement stmt;
>>     private ResultSet rs;
>>
>> So if you can't declare them as class members because they can't be
>> serialized, then how do you declare them so that the entire class has
>> access to them and I don't have to keep creating new connections for every
>> query?
>>
>> I'm quite sure that declaring and initializing them in *prepare()* won't
>> let the rest of the class's methods access them.
>>
>>
>> --
>>
>> Regards,
>>
>> Navin
>>
>
>
>
> --
> Regards,
> Navin
>


-- 
Name : Jungtaek Lim
Blog : http://medium.com/@heartsavior
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior
