flume-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLUME-3217) Flume creates empty files when HDFS quota has been reached
Date Fri, 17 Aug 2018 11:00:00 GMT

    [ https://issues.apache.org/jira/browse/FLUME-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16583778#comment-16583778 ]

ASF GitHub Bot commented on FLUME-3217:
---------------------------------------

GitHub user ffernandez92 opened a pull request:

    https://github.com/apache/flume/pull/225

    FLUME-3217. Creates empty files when quota

    As can be seen in FLUME-3217, Flume creates empty files when the HDFS quota has been reached.
    
    This is a first approach to the solution. It works, although a better approach could be
    implemented. The idea is to catch the quota exception in order to delete the empty files
    that Flume would otherwise leave behind.
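
    A rough sketch of the idea (not the actual patch; the helper class and method names
    below are hypothetical): create the file as usual and, if a quota exception is raised
    while writing or closing, delete the zero-length file before rethrowing, so repeated
    write attempts do not leave empty files behind on HDFS.

        import java.io.IOException;

        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;

        // Hypothetical helper illustrating the approach; not the actual BucketWriter change.
        public class QuotaAwareWriter {

          public static void writeOrCleanUp(FileSystem fs, Path path, byte[] payload)
              throws IOException {
            FSDataOutputStream out = fs.create(path);   // file creation itself succeeds
            try {
              out.write(payload);                       // fails once the space quota is hit
              out.close();
            } catch (DSQuotaExceededException e) {
              try {
                out.close();                            // best effort; the stream may be unusable
              } catch (IOException ignored) {
                // ignore secondary failures while cleaning up
              }
              // Nothing was persisted: remove the zero-length file so it is not
              // left behind on HDFS, then rethrow for the normal error handling.
              if (fs.exists(path) && fs.getFileStatus(path).getLen() == 0) {
                fs.delete(path, false);
              }
              throw e;
            }
          }
        }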

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ffernandez92/flume patch-3

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flume/pull/225.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #225
    
----
commit da08bed61fd91c2786a5e22d42eb265a8f3e76d3
Author: Ferran Fernández Garrido <ffernandez.upc@...>
Date:   2018-08-17T10:58:45Z

    FLUME-3217. Creates empty files when quota
    
    As can be seen in FLUME-3217, Flume creates empty files when the HDFS quota has been reached.
    
    This is a first approach to the solution. It works, although a better approach could be
    implemented. The idea is to catch the quota exception in order to delete the empty files
    that Flume would otherwise leave behind.

----


> Flume creates empty files when HDFS quota has been reached
> ----------------------------------------------------------
>
>                 Key: FLUME-3217
>                 URL: https://issues.apache.org/jira/browse/FLUME-3217
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: 1.8.0
>            Reporter: Denes Arvay
>            Priority: Critical
>
> Flume creates empty files when HDFS quota has been reached and leaves them on HDFS. 
> The file creation was successful, but as long as the quota did not allow any writes, a new file was created on every write attempt.
> Relevant error message from flume log:
>  {noformat}
> 2018-02-07 14:59:30,563 WARN org.apache.flume.sink.hdfs.BucketWriter: Caught IOException
writing to HDFSWriter (The DiskSpace quota of /data/catssolprn is exceeded: quota = 2199023255552
B = 2 TB but diskspace consumed = 2199217840800 B = 2.00 TB
> 	at org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyDiskspaceQuota(DirectoryWithQuotaFeature.java:149)
> 	at org.apache.hadoop.hdfs.server.namenode.DirectoryWithQuotaFeature.verifyQuota(DirectoryWithQuotaFeature.java:159)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:2037)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1868)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1843)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:441)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveAllocatedBlock(FSNamesystem.java:3806)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3394)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
> 	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2141)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2135)
> ). Closing file (/user/flume/events/FlumeData.1518033562880.log.tmp) and rethrowing exception.

> {noformat}
> Config for reproduction:
> {noformat}
> tier1.sources = source1
> tier1.channels = channel1
> tier1.sinks    = sink1
> tier1.sources.source1.type     = netcat
> tier1.sources.source1.bind     = 127.0.0.1
> tier1.sources.source1.port     = 9999
> tier1.sources.source1.channels = channel1
> tier1.channels.channel1.type                = memory
> tier1.sinks.sink1.type = hdfs
> tier1.sinks.sink1.hdfs.fileType = DataStream
> tier1.sinks.sink1.channel = channel1
> tier1.sinks.sink1.hdfs.path = hdfs://nameservice1/user/flume/events
> {noformat}
> hdfs dfs commands:
> {noformat}
> sudo -u flume hdfs dfs -mkdir -p /user/flume/events
> sudo -u hdfs hdfs dfsadmin -setSpaceQuota 3000 /user/flume/events
> {noformat}
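> Once the quota is set, pushing events through the netcat source above is enough to trigger
> the behaviour, for example (hypothetical input; any client writing to port 9999 works):
> {noformat}
> echo "quota test event" | nc 127.0.0.1 9999
> {noformat}
> Repeated attempts after the 3000-byte quota is exceeded leave a new empty file on HDFS each time.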



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@flume.apache.org
For additional commands, e-mail: issues-help@flume.apache.org

