flink-user mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: writeAsCsv not writing anything on HDFS when WriteMode set to OVERWRITE
Date Thu, 02 Jul 2015 08:20:51 GMT
Hi Mihail,

Thanks for the code. I'm trying to reproduce the problem now.

On Wed, Jul 1, 2015 at 8:30 PM, Mihail Vieru <vieru@informatik.hu-berlin.de>
wrote:

>  Hi Max,
>
> thank you for your reply. I wanted to review and rule out all other factors
> before writing back. I've attached my code and sample input data.
>
> I run the *APSPNaiveJob* using the following arguments:
>
> *0 100 hdfs://path/to/vertices-test-100 hdfs://path/to/edges-test-100
> hdfs://path/to/tempgraph 10 0.5 hdfs://path/to/output-apsp 9*
>
> I was wrong: I originally thought that the first writeAsCsv call (line 50)
> didn't work. Without WriteMode.OVERWRITE, an exception is thrown when the
> file exists.
>
> The problem actually lies with the second call (line 74), which tries to
> write to the same path on HDFS.
>
> This issue is blocking me, because I need to persist the vertices dataset
> between iterations.
>
> Cheers,
> Mihail
>
> P.S.: I'm using the latest 0.10-SNAPSHOT and HDFS 1.2.1.
>
>
>
> On 30.06.2015 16:51, Maximilian Michels wrote:
>
>   Hi Mihail,
>
>  Thank you for your question. Do you have a short example that reproduces
> the problem? It is hard to find the cause without an error message or some
> example code.
>
>  I wonder how your loop works without WriteMode.OVERWRITE because it
> should throw an exception in this case. Or do you change the file names on
> every write?
>
>  Cheers,
>  Max
>
> On Tue, Jun 30, 2015 at 3:47 PM, Mihail Vieru <
> vieru@informatik.hu-berlin.de> wrote:
>
>>  I think my problem is related to a loop in my job.
>>
>> Before the loop, the writeAsCsv method works fine, even in overwrite mode.
>>
>> In the loop, in the first iteration, it writes an empty folder containing
>> empty files to HDFS, even though the DataSet it is supposed to write
>> contains elements.
>>
>> Needless to say, this doesn't occur in a local execution environment,
>> when writing to the local file system.
>>
>>
>> I would appreciate any input on this.
>>
>> Best,
>> Mihail
>>
>>
>>
>> On 30.06.2015 12:10, Mihail Vieru wrote:
>>
>> Hi Till,
>>
>> thank you for your reply.
>>
>> I have the following code snippet:
>>
>> *intermediateGraph.getVertices().writeAsCsv(tempGraphOutputPath, "\n",
>> ";", WriteMode.OVERWRITE);*
>>
>> When I remove the WriteMode parameter, it works. So I can conclude that the
>> DataSet contains data elements.
>>
>> Cheers,
>> Mihail
>>
>>
>> On 30.06.2015 12:06, Till Rohrmann wrote:
>>
>>  Hi Mihail,
>>
>> have you checked that the DataSet you want to write to HDFS actually
>> contains data elements? You can try calling collect, which retrieves the
>> data to your client, to see what’s in there.
>>
>> Cheers,
>> Till
>>
>>
>> On Tue, Jun 30, 2015 at 12:01 PM, Mihail Vieru <
>> vieru@informatik.hu-berlin.de> wrote:
>>
>>> Hi,
>>>
>>> the writeAsCsv method is not writing anything to HDFS (version 1.2.1)
>>> when the WriteMode is set to OVERWRITE.
>>> A file is created, but it is empty, and there is no trace of errors in
>>> the Flink or Hadoop logs on any node in the cluster.
>>>
>>> What could cause this issue? I really need this feature.
>>>
>>> Best,
>>> Mihail
>>>
>>
>>
>>
>>
>
>
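For readers following the thread: the two behaviours discussed above (a write that fails when the target already exists, and a write that replaces it) can be illustrated on the local file system with plain java.nio. This is only an analogy under stated assumptions, not Flink's implementation and not a reproduction of the HDFS problem. The class name WriteModeSketch, the Mode enum, the writeLines helper, and the file name tempgraph.csv are all hypothetical and exist only for this sketch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public class WriteModeSketch {

    // Hypothetical stand-in for the two write modes discussed in the thread;
    // this is NOT Flink's WriteMode enum.
    public enum Mode { NO_OVERWRITE, OVERWRITE }

    // Writes one line per record to a local file, honouring the chosen mode.
    public static void writeLines(Path target, List<String> lines, Mode mode)
            throws IOException {
        if (mode == Mode.NO_OVERWRITE) {
            // CREATE_NEW throws FileAlreadyExistsException if the target exists.
            Files.write(target, lines, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        } else {
            // CREATE + TRUNCATE_EXISTING replaces any previous content.
            Files.write(target, lines, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("writemode-sketch");
        Path target = dir.resolve("tempgraph.csv");

        // First write: the target does not exist yet, so either mode succeeds.
        writeLines(target, Arrays.asList("1;0.0", "2;3.5"), Mode.NO_OVERWRITE);

        // Second write to the same path without overwrite is rejected.
        try {
            writeLines(target, Arrays.asList("1;0.0"), Mode.NO_OVERWRITE);
        } catch (FileAlreadyExistsException e) {
            System.out.println("second write without overwrite was rejected");
        }

        // Second write with overwrite replaces the earlier content.
        writeLines(target, Arrays.asList("1;0.0", "2;2.0"), Mode.OVERWRITE);
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```

In this local sketch the overwritten file holds the second batch of records, which is the behaviour Mihail expected from the second writeAsCsv call on HDFS.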
