spot-user mailing list archives

From Christos Minas Mathas
Subject Spot-ml parameters configuration
Date Tue, 23 Jan 2018 16:35:48 GMT

I've been evaluating the Netflow component of Spot for quite some time 
now by running different kinds of attacks and collecting the results. 
I'm using the default configuration and haven't changed any of the 
parameters, and the results I'm getting are not good. I was reading in 
the users mailing list some responses from Gustavo Lujan Moreno back in 
June 2017, in which he said about the results they're getting:
"On proxy we are getting > 0.90 on AUC and on net flow >0.99."

My results in terms of AUROC are more like ~0.52 or worse.
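
For context, an AUROC near 0.5 means the ranking of flows is close to 
random. The following is a minimal sketch of the pairwise definition of 
AUROC, not the evaluation code used by Spot or by anyone on this list. 
It assumes each flow has a ground-truth label (1 = attack, 0 = benign) 
and a score where a higher value means more suspicious; if the scores 
run the other way, negate them first. The function name auroc is 
hypothetical.

```python
def auroc(labels, scores):
    """Fraction of (attack, benign) pairs in which the attack flow has the
    higher score; tied pairs count as half.

    Assumption: label 1 = attack, 0 = benign, higher score = more suspicious.
    """
    attack = [s for l, s in zip(labels, scores) if l == 1]
    benign = [s for l, s in zip(labels, scores) if l == 0]
    if not attack or not benign:
        raise ValueError("need at least one attack and one benign flow")

    wins = 0.0
    for a in attack:
        for b in benign:
            if a > b:
                wins += 1.0   # attack ranked above benign
            elif a == b:
                wins += 0.5   # tie counts as half
    return wins / (len(attack) * len(benign))


if __name__ == "__main__":
    # Toy data only, not results from any real Spot run.
    print(auroc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The double loop is quadratic in the number of flows, so it is only 
practical for small labelled samples; a rank-based computation gives the 
same value on larger sets.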

He also gave some tips about configuring the parameters of spot-ml. So I 
thought I'd try them.

". . . "--ldamaxiterations 20" is the iteration parameter. You should 
change that 20 for something higher, at least 100, ideally +200.
. . .
If you are not getting good results the number of iterations and 
topics should be your priority."

1. I changed ldamaxiterations to 200, but after running for ~60000 
stages and two and a half hours, one of the associated VMs ran out of 
RAM and ml_ops exited with a StackOverflowException. So I assigned 32GB 
of RAM to each of the three associated VMs, and this time it stopped at 
~20000 stages, again with a StackOverflow, this time from a different 
VM. How much RAM would I need for 200 iterations, and for which 
services?

2. Can someone explain how I can properly configure the parameters of 
spot-ml? For the topic count, for example, how can I estimate an 
appropriate number of topics based on the traffic and the network setup?

If you need further information on my setup or the results I'm getting 
just let me know.

Thanks in advance
