samza-dev mailing list archives

From "Lukas Steiblys" <lu...@doubledutch.me>
Subject Multithreading ThreadJobFactory
Date Mon, 19 Oct 2015 22:49:17 GMT
I have been thinking lately about the least invasive way to add multithreading capabilities
to ThreadJobFactory, since that is the main way we run our jobs in production. Looking at
the master branch code in Git, I have found the following:
  a. The best way would be to simply spin up a new thread for each container.
  b. The number of containers can already be specified using the configuration property job.container.count.
  c. I can construct a new SamzaContainer for each containerModel returned from coordinator.jobModel.getContainers
in ThreadJobFactory.
  d. I can pass a list of these containers to the ThreadJob constructor, modifying it to accept
an array of Runnables.
  e. For each Runnable, ThreadJob would create a new thread and start it in its submit method.
This should start up a new thread for each container and group the tasks using the appropriate
TaskNameGrouper.
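
To make steps d and e concrete, here is a rough sketch in plain Java of a job that accepts
a list of Runnables and starts one thread per Runnable in submit. It is not the actual
ThreadJob code: the class name MultiThreadJobSketch, the waitForFinish and threadCount
methods, and the "container-N" thread names are all made up for illustration, and the
Runnables stand in for the SamzaContainer instances built in step c.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Samza's ThreadJob: one thread per container-like Runnable.
class MultiThreadJobSketch {
    private final List<Thread> threads = new ArrayList<>();

    // Wrap each Runnable in its own (not yet started) thread.
    MultiThreadJobSketch(List<? extends Runnable> containers) {
        int id = 0;
        for (Runnable container : containers) {
            threads.add(new Thread(container, "container-" + id));
            id++;
        }
    }

    // Start every thread; returns this so calls can be chained.
    MultiThreadJobSketch submit() {
        for (Thread thread : threads) {
            thread.start();
        }
        return this;
    }

    // Block until all threads finish. Returns false if the caller was interrupted.
    boolean waitForFinish() {
        try {
            for (Thread thread : threads) {
                thread.join();
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    int threadCount() {
        return threads.size();
    }
}
```

Usage would be along the lines of new MultiThreadJobSketch(containers).submit().waitForFinish().
The sketch leaves out anything about job status reporting or killing the job, and what should
happen to the other threads when one container fails is an open question.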

Any ideas on what I might have missed? Are there any other potential solutions? Would this
be a good patch for Samza in general?

Lukas
