storm-user mailing list archives

From Ashu Goel
Subject Re: Storm with Python
Date Fri, 30 May 2014 23:59:53 GMT

From what I understand, streamparse still requires that the topologies be in Clojure…
not entirely sure how this is different from what Storm already provides. I was looking more
for a DSL that we could use w/ Python 2.6 and be 100% Python, but it looks like that is not

On May 30, 2014, at 2:14 PM, Andrew Montalenti wrote:

For one thing, a recently accepted Storm pull request has made this serialization pluggable
and someone has already implemented a protobuf variety. We plan to investigate alternative
serialization options for multilang once we get the other tooling out of the way.

For another, it is true the overhead for serialization is nontrivial, but the overhead also
tends to be a constant factor applied to data size, and machines are cheap while programming
time is expensive. Storm and Python's data analysis and data integration libraries are a pretty
powerful combo worth the performance penalty.
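
One way to check the "constant factor applied to data size" claim on your own data is to time a JSON encode/decode round trip for tuples of different sizes. This is only a sketch using Python's standard `json` and `timeit` modules; it measures pure serialization cost inside one process, not the cost of Storm's multilang pipes, and the sample tuples and the `round_trip_us` helper are invented for illustration.

```python
import json
import timeit


def round_trip_us(values, repeats=2000):
    """Average microseconds to JSON-encode and then decode one tuple message."""
    message = {"command": "emit", "tuple": values}
    seconds = timeit.timeit(lambda: json.loads(json.dumps(message)), number=repeats)
    return seconds / repeats * 1e6


small = ["word"] * 10      # made-up small tuple
large = ["word"] * 1000    # made-up tuple with 100x the fields

for name, values in (("small", small), ("large", large)):
    size = len(json.dumps({"command": "emit", "tuple": values}))
    print("%s: %d bytes, %.1f us per round trip" % (name, size, round_trip_us(values)))
```

If the cost per byte comes out roughly the same for both sizes, the overhead is behaving like a constant factor; the absolute numbers will depend on the machine and Python version.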

On May 30, 2014 1:42 PM, "Larry Palmer" wrote:
We had experimented with Storm/Python 6 months ago or so, but found the JSON serialization/deserialization
overhead was quite high, on the order of several hundred usec per tuple every time it transitioned
from Java to Python or vice versa, limiting total throughput on a 12-core server to around
25k tuples/second. We considered trying to switch to a different serializer but ended up just
doing everything in Java instead.

Is that still the case, or perhaps has the speed been improved?
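
As a rough sanity check on the figures above, 25k tuples/second on 12 cores works out to a few hundred microseconds of core time per tuple, which matches the overhead described. This is a back-of-the-envelope sketch, assuming all 12 cores are fully busy and throughput is limited only by per-tuple cost; the function name is made up for illustration.

```python
def core_time_per_tuple_us(cores, tuples_per_second):
    """Core-microseconds available per tuple at a given aggregate throughput."""
    return cores * 1e6 / tuples_per_second


# 12 cores sharing 25,000 tuples/second
print(core_time_per_tuple_us(12, 25000))  # 480.0
```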

On Thu, May 29, 2014 at 10:06 PM, Andrew Montalenti wrote:

We are building a new Storm and Python interop option that is called streamparse:

It includes a heavily rewritten Storm interop library and a command line tool, sparse, for
managing local and remote Storm clusters. The idea is to make Storm projects as easy to build
and manage in Python as RQ or Celery projects.

It currently has support for running local clusters in a single command, managing virtualenvs
on remote worker machines, submitting topologies, listing/killing topologies, and tailing
remote log files. The multilang layer also has better support for logging and exception/error
handling. Multiple topologies can be built from a single codebase and multiple remote Storm
clusters can be supported via a simple JSON configuration file.

We are already using it for production topologies atop Storm 0.9.1 and Storm 0.8. We welcome
contributions and if you join our mailing list, feel free to make requests. We continue to
develop it actively and in an open manner.

-Andrew Montalenti

On May 29, 2014 6:35 PM, "Ashu Goel" wrote:
(the reason being is that we are still running Python 2.6 but Petrel is only compatible with
On May 29, 2014, at 2:48 PM, Ashu Goel wrote:

Awesome! I’m looking more into using the storm.thrift to define a non-JVM DSL… does anyone
have any working examples of this? Python preferred but any example will do. The wiki is a
bit confusing...
On May 28, 2014, at 1:54 PM, FRANCISCO JESUS GOMEZ RODRIGUEZ wrote:

Ashu, take a look at this project:

Write, submit, debug and monitor in python.


On 28/05/2014 22:49, Ashu Goel wrote:
Any examples where the entire infra is written in Python (including topology)? Or is that
not possible?
On May 28, 2014, at 1:33 PM, Dilpreet Singh wrote:

The WordCountTopology contains an example Python bolt.
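
For context, Storm's multilang protocol exchanges JSON messages over the bolt's stdin/stdout, each followed by a line containing `end`. The sketch below is not the storm-starter code; it is a self-contained illustration of what a sentence-splitting bolt does with one incoming tuple message. The function names `frame` and `handle_tuple` and the sample message are hypothetical. A real bolt would normally build on the `storm.py` helper module that ships with Storm rather than assemble messages by hand.

```python
import json


def frame(message):
    """Serialize one multilang message: JSON followed by an 'end' line."""
    return json.dumps(message) + "\nend\n"


def handle_tuple(raw_json):
    """Split the sentence in field 0 into words; emit one tuple per word, then ack."""
    tup = json.loads(raw_json)
    out = []
    for word in tup["tuple"][0].split():
        # Anchor each emitted word to the incoming tuple so failures can be replayed.
        out.append({"command": "emit", "anchors": [tup["id"]], "tuple": [word]})
    out.append({"command": "ack", "id": tup["id"]})
    return out


# Made-up incoming tuple message, shaped like the ones Storm sends to a bolt.
incoming = json.dumps({"id": "1", "comp": "spout", "stream": "default",
                       "task": 3, "tuple": ["the cow jumped"]})

for message in handle_tuple(incoming):
    print(frame(message), end="")
```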


On Thu, May 29, 2014 at 1:59 AM, Ashu Goel wrote:
Does anyone have a good example program/instructions for using Python with Storm? I can’t
seem to find anything concrete online.

Ashu Goel

