flume-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLUME-2973) Deadlock in hdfs sink
Date Tue, 28 Aug 2018 15:07:01 GMT

    [ https://issues.apache.org/jira/browse/FLUME-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595109#comment-16595109 ]

ASF GitHub Bot commented on FLUME-2973:
---------------------------------------

GitHub user majorendre opened a pull request:

    https://github.com/apache/flume/pull/226

    FLUME-2973 BucketWriter deadlock fix

    This PR is based on Yan Jian's fix and his test improvements. It also contains the deadlock reproduction contributed by @adenes.
    I have made minimal changes to those contributions.
    Denes's test was used to verify the fix.
    Yan's fix also contains an optimization: it first calls the callback function that removes the BucketWriter from the cache. This is useful and should help to avoid some errors.
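
The "callback first" ordering described above can be illustrated with a minimal sketch. This is not the code from the PR: `CallbackFirstSketch`, `Writer`, `cacheLock`, `cache` and `trace` are hypothetical stand-ins for `BucketWriter`, `sfWritersLock` and the writer cache, and the path used in the usage note is made up. The point it tries to show is that the cache lock and the writer lock are taken one after the other and never nested.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in classes; these are NOT the Flume sources.
final class CallbackFirstSketch {
    static final Object cacheLock = new Object();              // stands in for sfWritersLock
    static final Map<String, Writer> cache = new HashMap<>();  // stands in for the writer cache
    static final List<String> trace = new ArrayList<>();       // records the order of the steps

    static final class Writer {
        final String path;
        boolean open = true;

        Writer(String path) {
            this.path = path;
        }

        // The method as a whole is not synchronized: the callback step runs
        // before the writer's own monitor is taken, so the locks are not nested.
        void close() {
            synchronized (cacheLock) {   // step 1: cache lock only
                cache.remove(path);
                trace.add("removed-from-cache");
            }
            synchronized (this) {        // step 2: writer lock only
                open = false;
                trace.add("closed");
            }
        }
    }
}
```

Usage, assuming a writer registered under a made-up path: after `cache.put(w.path, w)` and `w.close()`, the cache no longer holds the writer and `trace` lists the removal before the close.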


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/majorendre/flume FLUME-2973

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flume/pull/226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #226
    


> Deadlock in hdfs sink
> ---------------------
>
>                 Key: FLUME-2973
>                 URL: https://issues.apache.org/jira/browse/FLUME-2973
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.7.0
>            Reporter: Denes Arvay
>            Assignee: Denes Arvay
>            Priority: Critical
>              Labels: hdfssink
>             Fix For: 1.9.0
>
>         Attachments: FLUME-2973-1.patch, FLUME-2973-min2.patch, FLUME-2973.patch
>
>
> Automatic close of BucketWriters (when the open file count reaches {{hdfs.maxOpenFiles}}) and the file rolling thread can end up in a deadlock.
> When creating a new {{BucketWriter}}, {{HDFSEventSink}} locks {{HDFSEventSink.sfWritersLock}}, and the {{close()}} called in {{HDFSEventSink.sfWritersLock.removeEldestEntry}} tries to lock the {{BucketWriter}} instance.
> On the other hand, if the file is being rolled in {{BucketWriter.close(boolean)}}, it locks the {{BucketWriter}} instance first, and in the close callback it tries to lock the {{sfWritersLock}}.
> The chance of this deadlock is higher when the value of {{hdfs.maxOpenFiles}} is low (1).
> Script to reproduce: https://gist.github.com/adenes/96503a6e737f9604ab3ee9397a5809ff
> (put to {{flume-ng-sinks/flume-hdfs-sink/src/main/java/org/apache/flume/sink/hdfs}})
> Deadlock usually occurs before ~30 iterations.
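
The lock-order inversion described in the issue can be reproduced in isolation with a small sketch. It is an illustration under stated assumptions, not the reproduction script linked above: `LockOrderSketch`, `run` and `takeBoth` are hypothetical names, the two `ReentrantLock` objects stand in for `sfWritersLock` and the `BucketWriter` monitor, and a bounded `tryLock` replaces the real blocking acquisition so the sketch reports a stall instead of hanging.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the two code paths in the issue; NOT the Flume sources.
final class LockOrderSketch {

    /**
     * Runs two tasks that each need both locks and returns how many of them
     * obtained both. With inverted == true the second task takes the locks
     * in the opposite order, which is the pattern the issue describes.
     */
    static int run(boolean inverted) {
        ReentrantLock cacheLock = new ReentrantLock();   // stands in for sfWritersLock
        ReentrantLock writerLock = new ReentrantLock();  // stands in for the BucketWriter monitor
        // The latches force the bad interleaving only in the inverted case.
        CountDownLatch bothHoldFirst = new CountDownLatch(inverted ? 2 : 0);
        CountDownLatch bothTried = new CountDownLatch(inverted ? 2 : 0);

        // "Eviction" path: cache lock first, then writer lock.
        Callable<Boolean> evict =
            () -> takeBoth(cacheLock, writerLock, bothHoldFirst, bothTried);
        // "Roll" path: writer lock first when inverted, same order otherwise.
        Callable<Boolean> roll = inverted
            ? () -> takeBoth(writerLock, cacheLock, bothHoldFirst, bothTried)
            : () -> takeBoth(cacheLock, writerLock, bothHoldFirst, bothTried);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Boolean> a = pool.submit(evict);
            Future<Boolean> b = pool.submit(roll);
            return (a.get() ? 1 : 0) + (b.get() ? 1 : 0);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    /** Holds `first`, then makes a bounded attempt to take `second`. */
    static boolean takeBoth(ReentrantLock first, ReentrantLock second,
                            CountDownLatch bothHoldFirst, CountDownLatch bothTried)
            throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();  // inverted case: wait until each task holds its first lock
            boolean gotSecond = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();      // keep holding `first` until both attempts have finished
            return gotSecond;
        } finally {
            first.unlock();
        }
    }
}
```

In this sketch `LockOrderSketch.run(true)` returns 0, because each task holds the lock the other one needs, while `LockOrderSketch.run(false)` returns 2, because both tasks take the locks in the same order. With plain `synchronized` blocks, as in the scenario the issue describes, the inverted case would block forever rather than time out.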



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@flume.apache.org
For additional commands, e-mail: issues-help@flume.apache.org

