spark-issues mailing list archives

From "Max Xu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-5147) write ahead logs from streaming receiver are not purged because cleanupOldBlocks in WriteAheadLogBasedBlockHandler is never called
Date Thu, 08 Jan 2015 16:09:34 GMT
Max Xu created SPARK-5147:
-----------------------------

             Summary: write ahead logs from streaming receiver are not purged because cleanupOldBlocks in WriteAheadLogBasedBlockHandler is never called
                 Key: SPARK-5147
                 URL: https://issues.apache.org/jira/browse/SPARK-5147
             Project: Spark
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.2.0
            Reporter: Max Xu


Hi all,

We are running a Spark Streaming application with ReliableKafkaReceiver. We have
"spark.streaming.receiver.writeAheadLog.enable" set to true, so write ahead logs (WALs)
for received data are created under the receivedData/streamId folder in the checkpoint
directory.
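
For reference, a minimal sketch of the kind of setup described above. Only the
configuration key comes from our application; the application name, the batch
interval, and the checkpoint path are placeholders chosen for illustration, and
the master is assumed to be supplied by spark-submit.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WalEnabledContext {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("wal-example") // hypothetical application name
      // Turns on write ahead logs for data received by receivers.
      .set("spark.streaming.receiver.writeAheadLog.enable", "true")

    // The 10 second batch interval is an assumption, not our real setting.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical checkpoint directory; the WALs end up under
    // receivedData/<streamId> inside it.
    ssc.checkpoint("hdfs:///tmp/wal-example-checkpoint")

    // Input streams and output operations would be defined here,
    // followed by ssc.start() and ssc.awaitTermination().
  }
}
```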

However, old WALs are never purged as they age, although receivedBlockMetadata and
checkpoint files are purged correctly. I went through the code: the
WriteAheadLogBasedBlockHandler class in ReceivedBlockHandler.scala is responsible for
cleaning up old blocks. It has a cleanupOldBlocks method, but no class ever calls it.
The ReceiverSupervisorImpl class holds a WriteAheadLogBasedBlockHandler instance, but it
only calls the storeBlock method to create WALs and never calls cleanupOldBlocks to
purge old ones.
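
To illustrate the effect, here is a toy model of the call pattern described
above. It is not Spark code: ToyWalHandler, ToySupervisor, the retention period,
and the simplified method signatures are all made up for this sketch. It only
shows that storage grows without bound when the store path is called and the
cleanup path is not.

```scala
import scala.collection.mutable

// Toy stand-in for a WAL-backed block handler (hypothetical, not Spark's class).
class ToyWalHandler {
  // Maps the write time (ms) of each log segment to its size in bytes.
  private val segments = mutable.Map.empty[Long, Long]

  def storeBlock(timeMs: Long, bytes: Long): Unit =
    segments(timeMs) = segments.getOrElse(timeMs, 0L) + bytes

  // Drops every segment written strictly before threshTimeMs.
  def cleanupOldBlocks(threshTimeMs: Long): Unit = {
    val expired = segments.keys.filter(_ < threshTimeMs).toList
    segments --= expired
  }

  def totalBytes: Long = segments.values.sum
}

// Toy supervisor: always stores, and purges only when `purge` is true.
class ToySupervisor(handler: ToyWalHandler, retentionMs: Long, purge: Boolean) {
  def onBlock(nowMs: Long, bytes: Long): Unit = {
    handler.storeBlock(nowMs, bytes)
    if (purge) handler.cleanupOldBlocks(nowMs - retentionMs)
  }
}

object WalGrowthDemo {
  // Stores ten 100-byte blocks, one per second, and returns the bytes kept.
  def run(purge: Boolean): Long = {
    val handler = new ToyWalHandler
    val supervisor = new ToySupervisor(handler, retentionMs = 3000L, purge = purge)
    (1 to 10).foreach(i => supervisor.onBlock(i * 1000L, 100L))
    handler.totalBytes
  }

  def main(args: Array[String]): Unit = {
    println(s"without cleanup: ${run(purge = false)} bytes") // 1000 bytes
    println(s"with cleanup:    ${run(purge = true)} bytes")  // 400 bytes
  }
}
```

With purge = false, which matches the behavior we observe, every segment is
kept. With purge = true, only the segments inside the retention window remain.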

The size of the WAL folder on HDFS grows constantly, which prevents us from running
the ReliableKafkaReceiver 24x7. Can somebody please take a look?

Thanks,
Max



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

