flink-user mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: WriteAsText bug or bad name?
Date Mon, 03 Nov 2014 10:58:17 GMT
Nope. To me this is actually a bug; I don't know what the Flink community
or committee thinks.

On Mon, Nov 3, 2014 at 11:52 AM, Fabian Hueske <fhueske@apache.org> wrote:

> Hi Flavio,
>
> any updates on this bug?
>
> Thanks, Fabian
>
> 2014-10-29 22:36 GMT+01:00 Fabian Hueske <fhueske@apache.org>:
>
>> Regarding the text vs. sequence output.
>> writeAsText() emits each record using its toString() method, which should
>> be the String itself in your case.
>>
>> So if it writes binary data, something is wrong...
>>
>>
>> 2014-10-29 22:34 GMT+01:00 Fabian Hueske <fhueske@apache.org>:
>>
>>> You can set the DOP of the data sink to 1 [1].
>>> There is also a config parameter that controls whether a directory is
>>> created when DOP=1. If I remember correctly, the default is to NOT create
>>> a folder for DOP=1.
>>>
>>> [1]
>>> http://flink.incubator.apache.org/docs/0.7-incubating/programming_guide.html#parallel-execution
>>>
>>> Best, Fabian
>>>
>>> 2014-10-29 22:22 GMT+01:00 Flavio Pompermaier <pompermaier@okkam.it>:
>>>
>>>> Would it be that difficult to change the behaviour for file:/// and
>>>> create a single file? Or is there a way to do that?
>>>> On Oct 29, 2014 9:52 PM, "Márton Balassi" <balassi.marton@gmail.com>
>>>> wrote:
>>>>
>>>>> Dear Flavio,
>>>>>
>>>>> Yes, the writeAsText() method really creates a folder which contains
>>>>> a file for each execution thread, so your threads do not block each other
>>>>> and the execution can use multiple cores on your machine. You can see
>>>>> similar results if you try it with env.execute() from an IDE.
>>>>>
>>>>> There are filesystems, HDFS being the most prominent one, which
>>>>> can transparently treat such a folder structure as a single file, and
>>>>> then it would behave as you expect. I hope this answers your question.
>>>>>
>>>>> Best,
>>>>>
>>>>> Marton
>>>>>
>>>>> On Wed, Oct 29, 2014 at 8:31 PM, Flavio Pompermaier <
>>>>> pompermaier@okkam.it> wrote:
>>>>>
>>>>>> Hi to all,
>>>>>> running the example at
>>>>>> http://flink.incubator.apache.org/docs/0.7-incubating/local_execution.html
>>>>>> I expected writeAsText on a local file to create a text
>>>>>> file on my local filesystem. Instead it creates something similar to a
>>>>>> sequence file (within a folder).
>>>>>> I think this is misleading: either the API name is wrong or
>>>>>> this is a bug (IMHO).
>>>>>> Btw, how can I modify the following program to write results to a
>>>>>> single text file on my local filesystem?
>>>>>>
>>>>>> public static void main(String[] args) throws Exception {
>>>>>>   ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment();
>>>>>>   DataSet<String> data = env.readTextFile("file:///tmp/res.txt");
>>>>>>   data.filter(new FilterFunction<String>() {
>>>>>>     public boolean filter(String value) {
>>>>>>       return value.startsWith("http://");
>>>>>>     }
>>>>>>   }).writeAsText("file:///tmp/res.txt");
>>>>>>   env.execute();
>>>>>> }
>>>>>>
>>>>>> Best,
>>>>>> Flavio
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>
>
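
For anyone who ends up with the folder of per-thread files described in the thread and just wants one text file, a possible workaround is to concatenate the parts afterwards. The following is only a minimal sketch in plain Java and does not use any Flink API. The class name PartFileMerger and the method merge are hypothetical names made up for this illustration. It assumes the target file lives outside the output folder and that the order of records across parts does not matter.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink: merges the part files that a
// parallel sink left in a folder into one text file.
public class PartFileMerger {

    /**
     * Concatenates every regular file found directly in partDir into target.
     * Parts are appended in lexicographic path order, so "10" sorts before "2";
     * this sketch assumes the order of records across parts is irrelevant.
     * Assumption: target is not located inside partDir.
     */
    public static void merge(Path partDir, Path target) throws IOException {
        List<Path> parts = new ArrayList<>();
        // Collect the part files first, before the target is opened.
        try (Stream<Path> entries = Files.list(partDir)) {
            entries.filter(Files::isRegularFile).forEach(parts::add);
        }
        Collections.sort(parts);

        // Create the target (or overwrite it) and append each part's bytes.
        try (OutputStream out = Files.newOutputStream(
                target,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
    }
}
```

It could be called after env.execute() returns, for example PartFileMerger.merge(Paths.get("/tmp/res-dir"), Paths.get("/tmp/res-merged.txt")), where both paths are placeholders. Setting the DOP of the data sink to 1, as Fabian suggests, avoids the extra step entirely.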
