flink-user mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: Runtime generated (source) datasets
Date Wed, 21 Jan 2015 12:55:18 GMT
Hi Flavio,

If your question is whether you can write a Flink job that reads its input
from different sources depending on user input, then the answer is yes.
Flink job plans are generated at runtime, when your program runs, so you can
easily write a method that returns a user-dependent input data set.

You could do something like this:

DataSet<ElementType> getInput(String[] args, ExecutionEnvironment env) {
  // Compare string contents with equals(); == would compare references.
  if ("csv".equals(args[0])) {
    return env.readCsvFile(...);
  } else {
    return env.createInput(new AvroInputFormat<ElementType>(...));
  }
}

This works as long as the element type of the data set is the same for all
possible data sources. I hope I understood your problem correctly.
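
To illustrate the idea without depending on Flink, here is a minimal
plain-Java sketch. It is only a stand-in: the Person type, the "csv" and "kv"
format names, and both parser methods are hypothetical and not part of any
Flink API. What it shows is that the source is chosen by ordinary code at
runtime, and that every branch must produce the same element type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class RuntimeSourceSelection {

    /** Hypothetical common element type that every source must produce. */
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /** Hypothetical parser for records such as "Ann,30". */
    static Person parseCsv(String line) {
        String[] parts = line.split(",");
        return new Person(parts[0].trim(), Integer.parseInt(parts[1].trim()));
    }

    /** Hypothetical parser for records such as "name=Bob;age=41". */
    static Person parseKeyValue(String line) {
        String[] fields = line.split(";");
        String name = fields[0].split("=")[1].trim();
        int age = Integer.parseInt(fields[1].split("=")[1].trim());
        return new Person(name, age);
    }

    /** Picks the parser at runtime; both branches yield Person elements. */
    static List<Person> getInput(String format, List<String> rawRecords) {
        Function<String, Person> parser;
        if ("csv".equals(format)) {
            parser = RuntimeSourceSelection::parseCsv;
        } else if ("kv".equals(format)) {
            parser = RuntimeSourceSelection::parseKeyValue;
        } else {
            throw new IllegalArgumentException("Unknown format: " + format);
        }
        List<Person> result = new ArrayList<>();
        for (String record : rawRecords) {
            result.add(parser.apply(record));
        }
        return result;
    }

    public static void main(String[] args) {
        // The format would normally come from args[0]; fixed here for the demo.
        List<Person> people = getInput("csv", Arrays.asList("Ann,30", "Bob,41"));
        for (Person p : people) {
            System.out.println(p.name + " is " + p.age);
        }
    }
}
```

Code that consumes the result of getInput never needs to know which branch
was taken, which is why the shared element type matters.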



On Wed, Jan 21, 2015 at 11:45 AM, Flavio Pompermaier <pompermaier@okkam.it> wrote:

> Hi guys,
> I have a big question for you about how Flink handles job plan
> generation.
> Let's suppose that I want to write a job that takes as input a description
> of a set of datasets that I want to work on (for example, a CSV file and its
> path, 2 HBase tables, 1 Parquet directory and its path, etc.).
> From what I know, Flink generates the job's plan at compile time, so I was
> wondering whether this is possible right now or not.
> Thanks in advance,
> Flavio
