nifi-dev mailing list archives

From "Erik Anderson" <eand...@pobox.com>
Subject NiFi kubectl for launching container jobs
Date Fri, 28 Jun 2019 13:10:28 GMT
I have heard about NiFi-Fn (B23 Kubernetes Operator for NiFi-Fn).

Has anyone built a NiFi kubectl processor, and possibly a nice NiFi "remote jobs" base Docker
container, that can be used to control a remote NiFi processor/job that conforms to Apache
NiFi input and output mechanisms (flow file format)?

I know we would need a way to marshal the NiFi flowfile format in and out of a container,
but if we had one, we could launch remote Python processes that scale well using cloud-native
mechanisms (DevOps).
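
To make the marshalling idea concrete, here is a minimal sketch of passing a flowfile's attributes and content across a container boundary. The envelope layout (a 4-byte length, a JSON attribute header, then the raw content) and the function names `pack_flowfile` and `unpack_flowfile` are assumptions made up for illustration; this is not NiFi's own flowfile packaging format.

```python
import io
import json
import struct

# Hypothetical envelope, NOT NiFi's own flowfile packaging format:
# [4-byte big-endian header length][UTF-8 JSON attributes][raw content bytes]


def pack_flowfile(attributes, content):
    """Serialize attributes (dict of str -> str) and content (bytes) into one blob."""
    header = json.dumps(attributes, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header)) + header + content


def unpack_flowfile(stream):
    """Read one envelope from a binary stream and return (attributes, content)."""
    prefix = stream.read(4)
    if len(prefix) != 4:
        raise ValueError("truncated envelope: missing header length")
    (header_len,) = struct.unpack(">I", prefix)
    header = stream.read(header_len)
    if len(header) != header_len:
        raise ValueError("truncated envelope: incomplete header")
    attributes = json.loads(header.decode("utf-8"))
    content = stream.read()  # everything after the header is the content
    return attributes, content


if __name__ == "__main__":
    blob = pack_flowfile({"filename": "roads.geojson"}, b'{"type": "FeatureCollection"}')
    attrs, body = unpack_flowfile(io.BytesIO(blob))
    print(attrs, len(body))
```

A remote Python job could read such an envelope from stdin, process the content, and write a new envelope to stdout, so the container never needs to know about NiFi internals.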

We built a native Python 2.7/3.7 NiFi processor that allows you to quickly chain together
Java and Python flows. This is powerful because most data infrastructure is in Python, not
Java, especially for geospatial data. Of course this won't scale, because only so many Python
processors can run on a single NiFi node, but it allows you to quickly get things
working. In 2 days you can do some amazing things.

If I can now offload that Python processing via Kubernetes kubectl, we can use automated
DevOps scaling for some really large jobs, possibly using a NiFi processor that wraps https://github.com/kubernetes-client/java
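
As a rough sketch of what launching such a job could look like, the snippet below builds a Kubernetes `batch/v1` Job manifest and pipes it to `kubectl apply -f -`. It shells out to the kubectl CLI from Python rather than wrapping the Java client mentioned above. The helper names `build_job_manifest` and `submit_job`, the image name, and the arguments are hypothetical placeholders, and `submit_job` assumes kubectl is on the PATH with a configured cluster context.

```python
import json
import subprocess


def build_job_manifest(name, image, args, namespace="default"):
    """Return a batch/v1 Job manifest that runs one container to completion."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed pod
            "template": {
                "spec": {
                    "restartPolicy": "Never",  # Jobs require Never or OnFailure
                    "containers": [
                        {"name": name, "image": image, "args": list(args)}
                    ],
                }
            },
        },
    }


def submit_job(manifest):
    """Send the manifest to the cluster; kubectl accepts JSON on stdin."""
    subprocess.run(
        ["kubectl", "apply", "-f", "-"],
        input=json.dumps(manifest).encode("utf-8"),
        check=True,
    )


if __name__ == "__main__":
    # Hypothetical image and arguments, for illustration only.
    job = build_job_manifest(
        "geo-worker", "example.org/geo-worker:latest", ["--input", "roads.geojson"]
    )
    print(json.dumps(job, indent=2))
```

Only the manifest is printed here; calling `submit_job(job)` would actually create the Job on whatever cluster the current kubectl context points at.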

Why all this jazz?
Real use case: geospatial data (GeoJSON, ESRI Shapefile, etc.). It requires standard Python
"pip install blah-blah" packages to process it.

Thoughts? Please throw tomatoes at the idea. I welcome constructive and destructive criticism
because that means people care.

Erik Anderson
Bloomberg

