flume-issues mailing list archives

From "felix.wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLUME-3308) TailDirSource will write an empty position json file
Date Wed, 10 Apr 2019 02:06:00 GMT

    [ https://issues.apache.org/jira/browse/FLUME-3308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813989#comment-16813989 ]

felix.wang commented on FLUME-3308:


This also happens with an exception such as:
java.nio.file.FileSystemException: /data/log/td/s001/20190330: Too many open files
In that case the if (!existingInodes.isEmpty()) check skips straight to writer.close(),
which writes an empty position JSON file.
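
A minimal sketch of that failure mode, not Flume's actual code: the class name, method names, and the List<Long> stand-in for existingInodes are hypothetical. It only illustrates that opening a java.io.FileWriter truncates the target file, so skipping the write and closing the writer leaves a zero-byte file behind.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not the TailDirSource implementation.
class EmptyPositionFileDemo {

    // Stand-in for a position writer that guards the write with an isEmpty() check.
    static void writePosition(Path positionFile, List<Long> existingInodes) {
        // Opening the FileWriter (without append) truncates the file right away.
        try (FileWriter writer = new FileWriter(positionFile.toFile())) {
            if (!existingInodes.isEmpty()) {
                writer.write(existingInodes.toString());
            }
            // With an empty list nothing is written; close() leaves a 0-byte file.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a non-empty position file, then runs the writer with an empty list
    // and returns the resulting file size in bytes.
    static long demo() {
        try {
            Path file = Files.createTempFile("taildir_position", ".json");
            Files.write(file, "[{\"inode\":1,\"pos\":42}]".getBytes(StandardCharsets.UTF_8));
            writePosition(file, new ArrayList<>());
            long size = Files.size(file);
            Files.delete(file);
            return size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the previously saved positions are lost as soon as the writer is opened, whether or not anything is written afterwards.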

> TailDirSource will write an empty position json file
> ----------------------------------------------------
>                 Key: FLUME-3308
>                 URL: https://issues.apache.org/jira/browse/FLUME-3308
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.8.0
>         Environment: Flume 1.8, TailDirSource
>            Reporter: Xingxing Di
>            Priority: Critical
> We found that TailDirSource has a critical issue while writing the position JSON file:
> 1. In the "process" method, calling existingInodes.clear() and then existingInodes.addAll()
> is not thread-safe: the positionWriter thread reads existingInodes without a lock and can
> see an empty list (just after clear() has executed and before addAll() is called). This
> causes an empty JSON file to be written.
> 2. FileWriter is not atomic. The position JSON file is over 5 MB in our case (which is
> big); even after fixing the issue above, we still occasionally read an empty position
> JSON file.
> If Flume is restarted and reads an empty position JSON file, it will tail all files
> from the beginning, which is critical!
> So we made a small change:
>  # Add a lock for the existingInodes list, and make a copy of existingInodes in the
> positionWriter thread every time.
>  # Replace FileWriter with an AtomicFileWriter.
> I will open a PR later to share this.
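
The two changes described in the issue could look roughly like the sketch below. This is an assumption-laden illustration, not the reporter's patch: the class and method names are hypothetical, a List<Long> stands in for existingInodes, and a temp-file-plus-rename via java.nio.file.Files.move with ATOMIC_MOVE stands in for the AtomicFileWriter, whose implementation is not shown in this thread. Skipping the write when the snapshot is empty is an extra choice made here, not something the issue states.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names and structure are assumptions, not Flume code.
class SafePositionWriterSketch {
    private final Object lock = new Object();
    private final List<Long> existingInodes = new ArrayList<>();

    // Source thread: clear() and addAll() happen under one lock, so a reader
    // can never observe the list in the gap between the two calls.
    void updateInodes(List<Long> fresh) {
        synchronized (lock) {
            existingInodes.clear();
            existingInodes.addAll(fresh);
        }
    }

    // Writer thread: take a private copy under the same lock.
    List<Long> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(existingInodes);
        }
    }

    // Write to a temp file in the same directory, then rename over the target.
    // Whether ATOMIC_MOVE replaces an existing target is implementation specific.
    void writePosition(Path positionFile) {
        List<Long> copy = snapshot();
        if (copy.isEmpty()) {
            return; // assumption: keep the previous file rather than emptying it
        }
        try {
            Path dir = positionFile.toAbsolutePath().getParent();
            Path tmp = Files.createTempFile(dir, "position", ".tmp");
            Files.write(tmp, copy.toString().getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, positionFile, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a non-empty snapshot, then attempts a write with an empty list,
    // and returns what is left in the position file.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("taildir_sketch");
            Path positionFile = dir.resolve("taildir_position.json");
            SafePositionWriterSketch sketch = new SafePositionWriterSketch();
            sketch.updateInodes(Arrays.asList(1L, 2L, 3L));
            sketch.writePosition(positionFile);
            sketch.updateInodes(Collections.<Long>emptyList());
            sketch.writePosition(positionFile);
            return new String(Files.readAllBytes(positionFile), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the rename happens only after the temp file is fully written, a reader of the position file should see either the old content or the new content, never a partially written file, on file systems where the move is atomic.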

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: issues-unsubscribe@flume.apache.org
For additional commands, e-mail: issues-help@flume.apache.org
