flink-user mailing list archives

From Bill Sparks <jspa...@cray.com>
Subject RE: data conversion between flink and "other" paradigms
Date Mon, 06 Jul 2015 08:28:16 GMT

Thanks for the info and the pointer to Python. I'll check it out.


From: Fabian Hueske [fhueske@gmail.com]
Sent: Monday, July 06, 2015 3:23 AM
To: user@flink.apache.org
Subject: Re: data conversion between flink and "other" paradigms

Hi Bill,

A DataSet is just a logical concept in Flink. DataSets are often not persisted; their records
are simply streamed from one operator to the next. At the moment, there is no way to access an
intermediate DataSet of a Flink program directly (this might change in the future).

You can process data in another function by implementing a Java user function (for example
a MapPartitionFunction) and sending the data through JNI to a C function. If the C function
needs to see the full data set in a single call, you must set the parallelism to 1. Flink's
Python API follows a similar approach to ship data from Flink to an external Python process.
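
A minimal sketch of the idea, not a tested Flink job: the records of one partition are packed
into a direct ByteBuffer (a flat layout a C function could read through JNI's
GetDirectBufferAddress) and handed over in a single call. To keep the sketch runnable without
Flink or a native library on the classpath, some parts are assumptions: the class name
PartitionToNative, the native method nativeSum, and the pure-Java fallback javaSum are all
hypothetical, and a java.util.function.Consumer stands in for Flink's Collector. The mapPartition
method mirrors the shape of MapPartitionFunction.mapPartition(Iterable<T>, Collector<O>).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class PartitionToNative {

    // Hypothetical native entry point (assumption, not part of Flink).
    // A C implementation would read `count` doubles from the direct buffer.
    private static native double nativeSum(ByteBuffer data, int count);

    // Pure-Java stand-in, used here because no native library is loaded.
    static double javaSum(ByteBuffer data, int count) {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += data.getDouble(i * Double.BYTES); // absolute read, position unchanged
        }
        return sum;
    }

    // Convert the records to a "common format": contiguous doubles in native byte order.
    static ByteBuffer pack(List<Double> records) {
        ByteBuffer buf = ByteBuffer
                .allocateDirect(records.size() * Double.BYTES)
                .order(ByteOrder.nativeOrder());
        for (double d : records) {
            buf.putDouble(d);
        }
        buf.flip();
        return buf;
    }

    // Same shape as a Flink mapPartition method; Consumer stands in for Collector.
    static void mapPartition(Iterable<Double> values, Consumer<Double> out) {
        List<Double> records = new ArrayList<>();
        for (Double v : values) {
            records.add(v); // buffer the whole partition before the native call
        }
        ByteBuffer buf = pack(records);
        double result;
        try {
            result = nativeSum(buf, records.size());
        } catch (UnsatisfiedLinkError e) {
            // No native library available in this sketch: fall back to Java.
            result = javaSum(buf, records.size());
        }
        out.accept(result);
    }

    public static void main(String[] args) {
        List<Double> results = new ArrayList<>();
        mapPartition(Arrays.asList(1.0, 2.5, 4.0), results::add);
        System.out.println(results);
    }
}
```

In a real job the body of mapPartition would live in a class implementing MapPartitionFunction,
and the native library would be loaded once per task (for example with System.loadLibrary)
before the first call. Buffering the whole partition costs memory, so this only suits partitions
that fit on the heap of a single task.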

Best, Fabian

2015-07-06 9:30 GMT+02:00 Bill Sparks <jsparks@cray.com>:

Just a question about whether there is any prior art here. Say someone wanted to use Flink for
processing, but at some point wanted to call another function, say via JNI/C, which doesn't
understand DataSets. How would one go about this? I'm assuming the code would have to
convert the data to a common format prior to calling the function.


