storm-user mailing list archives

From Andrea Gazzarini <gxs...@gmail.com>
Subject Re: Identifying the source of the memory error in Storm
Date Sun, 05 Feb 2017 13:40:51 GMT
Hi Navin,
I think this line is a good starting point for your analysis:

/"There is insufficient memory for the Java Runtime Environment to 
continue."

I don't believe this condition is reported by the JVM as a checked 
exception: in my opinion it belongs to the "Error" class, and that would 
explain why the catch block is never reached.
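
To make that point concrete, here is a minimal, self-contained sketch 
(CatchDemo is just a throwaway name and the OutOfMemoryError is thrown 
by hand, it has nothing to do with your topology code): a catch block 
typed on Exception lets an Error pass straight through, because Error 
and Exception are sibling subclasses of Throwable.

    public class CatchDemo {
        public static void main(String[] args) {
            try {
                // OutOfMemoryError extends Error, not Exception
                throw new OutOfMemoryError("simulated");
            } catch (Exception ex) {
                // never reached for an Error
                System.out.println("caught Exception: " + ex);
            } catch (Throwable t) {
                // reached only because we widened the catch to Throwable
                System.out.println("caught Throwable: " + t);
            }
        }
    }

And even catching Throwable would probably not help in your case: the 
message you pasted comes from a failed native allocation 
(os::commit_memory), and as far as I know the JVM simply aborts at that 
point, before any Java-level throwable is raised.
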
In addition, your assumption could also be right: the code that 
triggers the failure could be anywhere in the worker process, not 
necessarily within your class. Unlike what generally happens with 
ordinary exceptions, memory errors don't have a deterministic point of 
failure; they depend on the state of the system at that moment.

Please tell us a bit more about (or investigate yourself) your 
architecture, nodes, hardware resources and anything else that helps 
in understanding your context. Tools like JVisualVM, JConsole and the 
Storm UI are precious friends in these situations.
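
If attaching JVisualVM or JConsole to the (short-lived) worker 
processes is not practical, a rough alternative, just a sketch with 
made-up names, is to log the heap figures from inside the topology 
itself via java.lang.management, for example every few thousand 
executed tuples:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;
    import org.slf4j.Logger;

    public final class HeapSnapshot {
        private HeapSnapshot() {}

        // logs used/committed/max heap in MB through the given SLF4J logger
        public static void log(Logger logger) {
            MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            logger.info("heap used={}MB committed={}MB max={}MB",
                    heap.getUsed() >> 20,
                    heap.getCommitted() >> 20,
                    heap.getMax() >> 20);
        }
    }

Keep in mind that your error is about native memory (the mmap call 
failed), so heap figures alone won't tell the whole story; but seeing 
how much memory twenty workers, each allowed up to -Xmx2g, try to 
commit (if they all run on the same box) is already useful information.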

Best,
Andrea

On 05/02/17 12:53, Navin Ipe wrote:
> Hi,
>
> I have a bolt which sometimes emits around 15000 tuples, and sometimes
> more than 20000. I think that when this happens there's a memory issue
> and the workers get restarted. This is what worker.log.err contains:
> Java HotSpot(TM) 64-Bit Server VM warning: INFO:
> os::commit_memory(0x00000000f1000000, 62914560, 0) failed;
> error='Cannot allocate memory' (errno=12)
> # There is insufficient memory for the Java Runtime Environment to
> continue.
> # Native memory allocation (mmap) failed to map 62914560 bytes for
> committing reserved memory.
> # An error report file with more information is saved as:
> # /home/storm/apache-storm-1.0.0/storm-local/workers/6a1a70ad-d094-437a-a9c5-e837fc1b3535/hs_err_pid2766.log
>
> The odd part is that in all my bolts I have:
>
>     @Override
>     public void execute(Tuple tuple) {
>         try {
>             // ...some code, including the code that emits tuples
>         } catch (Exception ex) {
>             logger.info("The exception {}, {}", ex.getCause(), ex.getMessage());
>         }
>     }
>
> But in the logs I never see the string "The exception". worker.log
> shows:
>
> 2017-02-05 09:14:01.320 STDERR [INFO] Java HotSpot(TM) 64-Bit Server
> VM warning: INFO: os::commit_memory(0x00000000e6f80000, 37748736, 0)
> failed; error='Cannot allocate memory' (errno=12)
> 2017-02-05 09:14:01.320 STDERR [INFO] #
> 2017-02-05 09:14:01.330 STDERR [INFO] # There is insufficient memory
> for the Java Runtime Environment to continue.
> 2017-02-05 09:14:01.330 STDERR [INFO] # Native memory allocation
> (mmap) failed to map 37748736 bytes for committing reserved memory.
> 2017-02-05 09:14:01.331 STDERR [INFO] # An error report file with more
> information is saved as:
> 2017-02-05 09:14:01.331 STDERR [INFO] # /home/storm/apache-storm-1.0.0/storm-local/workers/2685b445-c4a9-4f7e-94e1-1ce3fe13de47/hs_err_pid3022.log
> 2017-02-05 09:14:06.904 o.a.s.d.worker [INFO] Launching worker for
> HydraCellGen-138-1486283223 on
> 3fc3c05e-9769-4033-bf7d-df609d6c4963:6701 with id
> 575bd7ed-a3fc-4f7f-a7d0-cdd4054c9fc5 and conf
> {"topology.builtin.metrics.bucket.size.secs" 60, "nimbus.childopts"
> "-Xmx1024m", ... etc.
>
> These are the settings I'm using for the topology:
>
>     Config stormConfig = new Config();
>     stormConfig.setNumWorkers(20);
>     stormConfig.setNumAckers(20);
>     stormConfig.put(Config.TOPOLOGY_DEBUG, false);
>     stormConfig.put(Config.TOPOLOGY_TRANSFER_BUFFER_SIZE, 1024);
>     stormConfig.put(Config.TOPOLOGY_EXECUTOR_RECEIVE_BUFFER_SIZE, 65536);
>     stormConfig.put(Config.TOPOLOGY_EXECUTOR_SEND_BUFFER_SIZE, 65536);
>     stormConfig.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 2);
>     stormConfig.put(Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS, 2200);
>     stormConfig.put(Config.STORM_ZOOKEEPER_SERVERS, Arrays.asList(new String[]{"localhost"}));
>     stormConfig.put(Config.TOPOLOGY_WORKER_CHILDOPTS, "-Xmx" + "2g");
>
>
> So am I right in assuming the exception is not thrown in my code but
> somewhere else in the worker thread? Do such errors happen when the
> worker isn't able to receive that many tuples into its queue?
>
> What can I do to avoid this problem?
>
> -- 
> Regards,
> Navin

