samza-dev mailing list archives

From "Lukas Steiblys"
Subject Samza Memory Usage on YARN
Date Thu, 18 Sep 2014 17:13:08 GMT

I’m trying to use Samza for our new data processing pipeline, with YARN for job scheduling,
and I’ve noticed that it consumes a very large amount of memory. The Application
Master, which in my opinion should be a very lightweight application, consumes about 1.4GB
of virtual memory and about 200MB of physical memory. The same goes for the actual tasks.

Is this behavior common, or could it be a misconfiguration? As I understand it, one of the
problems is that each container has its own VM instance and has to load all the libraries.
Could there be other issues? Would it be possible to split the application
master package from the task package so that it’s more lightweight?
