storm-user mailing list archives

From Roshan Naik <roshan_n...@yahoo.com>
Subject Re: Storm throughput
Date Sat, 31 Mar 2018 05:19:30 GMT
 
Something is definitely broken in your run or in your measurement method, and it's not your
hardware that is at fault. The machine on which those numbers were run had lots of cores, but
the cores were not fast at all. Even my mid-2015 MacBook Pro has faster cores than that machine,
which had old Intel CPUs.
You may be making some mistakes in your calculations. Just run the topo for about 14 mins,
take the 10 min window reading directly from the UI, and calculate the per-second throughput from
that (that way you disregard the first 3 or 4 mins to allow for warm-up). Also, are you overriding
any default settings?
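
A minimal sketch of that arithmetic, assuming the 10 min window gives a plain count of
emitted tuples. The function name and the sample figure are made up for illustration and
are not taken from any real run:

```python
def per_second_from_window(emitted_in_window: int, window_secs: int = 600) -> float:
    """Average tuples/s over one UI window (600 s for the 10 min window)."""
    if window_secs <= 0:
        raise ValueError("window_secs must be positive")
    return emitted_in_window / window_secs


# Hypothetical reading: 19,200,000 tuples shown for the 10 min window.
print(per_second_from_window(19_200_000))  # 32000.0
```

Reading the window after about 14 mins means the counted 10 mins exclude the warm-up period.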

Here is the code for the topo that was used: https://github.com/apache/storm/blob/1.1.x-branch/examples/storm-perf/src/main/java/org/apache/storm/perf/ConstSpoutOnlyTopo.java


-roshan

On Friday, March 30, 2018, 8:24:39 AM PDT, Alessio Pagliari <pagliari@i3s.unice.fr> wrote:
 
 Surely they work on a far more powerful cluster, but the topology is composed of just one
spout. No parallelization, no bolts, for a total of one worker, so 1 thread in a JVM. Even
if I had 100 cores like them, it shouldn't make any difference. Please correct me if I'm wrong.

Such a topology will assign its only spout to a worker on one node, so the multi-node cluster
is pointless. As for the number of cores, one executor cannot run on multiple cores at the
same time, since it is not a multi-threaded process.

Is there some Storm or Java behavior that I'm not aware of?

Thank you,

Alessio

Sent from BlueMail

On Mar 30, 2018, at 4:28 PM, Jacob Johansen <johansenjuwp@gmail.com> wrote:
 For their test, they were using 4 worker nodes (servers), each with 24 vCores, for a total of
96 vCores. Most laptops max out at 8 vCores and typically have 4-6 vCores.
   Jacob Johansen 
   
  On Fri, Mar 30, 2018 at 9:18 AM, Alessio Pagliari <pagliari@i3s.unice.fr> wrote: 
 
  Hi everybody,  
   I’m trying to do some preliminary tests with Storm, to understand how far it can go.
Right now I’m trying to understand what its maximum throughput is in terms of tuples
per second. I saw the benchmark done by the guys at Hortonworks
(ref: https://it.hortonworks.com/blog/microbenchmarking-storm-1-0-performance/) and in the
first test they reach a spout emission rate of 3.2 million tuples/s.
   I tried to replicate the test: a simple spout that continuously emits the same string “some
data”. Unlike them, I’m using Storm 1.1.1 and the Storm cluster is set up on
my laptop. In any case, I’m just testing one spout, not an entire topology, but if you think
more configuration information is needed, just ask.
   To compute the throughput, I query the UI API every 10 s for the total number of tuples
processed and subtract the previous reading to get the number of tuples in the last 10 s.
What the math gives me is something around 32k tuples/s.
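
   The subtraction described above can be sketched roughly as follows. This assumes the
readings are cumulative totals taken at a fixed interval; the function name and the sample
values are hypothetical, and the step that actually fetches the totals from the UI is left out:

```python
def rates_from_cumulative(samples, interval_secs=10.0):
    """Convert cumulative tuple totals, sampled every interval_secs, to tuples/s."""
    rates = []
    for prev, curr in zip(samples, samples[1:]):
        delta = curr - prev
        if delta < 0:
            # Total went backwards (for example after a restart): skip this pair.
            continue
        rates.append(delta / interval_secs)
    return rates


# Hypothetical totals read 10 s apart.
print(rates_from_cumulative([0, 320_000, 640_000, 960_000]))
# [32000.0, 32000.0, 32000.0]
```
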
   I don’t think I’m wrong in saying that 32k is not even comparable to 3.2 million. Is there
something that I’m missing? Is this output normal?
   Thank you for your help and for your time,   
   Alessio   
  

  