spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From on <>
Subject Performance tuning for standalone on one host
Date Mon, 25 Jul 2016 17:21:46 GMT
Dear all,

I am running spark on one host ("local[2]") doing calculations like this
on a socket stream.
mainStream = socketStream.filter(lambda msg:
msg['header'].startswith('test')).map(lambda x: (x['host'], x) )
s1 = mainStream.updateStateByKey(updateFirst).map(lambda x: (1, x) )
s2 = mainStream.updateStateByKey(updateSecond,
initialRDD=initialMachineStates).map(lambda x: (2, x) )

I evaluated each calculations allone has a processing time about 400ms
but processing time of the code above is over 3 sec on average.

I know there are a lot of parameters unknown but does anybody has hints
how to tune this code / system? I already changed a lot of parameters,
such as #executors, #cores and so on.

Thanks in advance and best regards,

To unsubscribe e-mail:

View raw message