jackrabbit-oak-issues mailing list archives

From "Thomas Mueller (Jira)" <j...@apache.org>
Subject [jira] [Created] (OAK-8991) MarkSweepGarbageCollector: repeated warnings for files that don't exist
Date Thu, 02 Apr 2020 13:16:00 GMT
Thomas Mueller created OAK-8991:

             Summary: MarkSweepGarbageCollector: repeated warnings for files that don't exist
                 Key: OAK-8991
                 URL: https://issues.apache.org/jira/browse/OAK-8991
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: blob
            Reporter: Thomas Mueller

When using the MarkSweepGarbageCollector (for example, with a file data store), if the blob
id file (from the BlobIdTracker) contains records that don't exist in the data store, then
a warning is logged when the collector tries to remove the (unreferenced) file:

*WARN* org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector Error occurred while
deleting blob with id [...]
org.apache.jackrabbit.core.data.DataStoreException: Record ... does not exist
	at org.apache.jackrabbit.core.data.AbstractDataStore.getRecord(AbstractDataStore.java:59)
	at org.apache.jackrabbit.oak.plugins.blob.datastore.OakFileDataStore.getRecordForId(OakFileDataStore.java:259)
	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getRecordForId(DataStoreBlobStore.java:520)
	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.countDeleteChunks(DataStoreBlobStore.java:426)
	at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector$BlobCollectionType.sweepInternal(MarkSweepGarbageCollector.java:859)
	at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.sweep(MarkSweepGarbageCollector.java:423)
	at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.markAndSweep(MarkSweepGarbageCollector.java:287)
	at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.collectGarbage(MarkSweepGarbageCollector.java:194)

That means it tried to remove a file that doesn't exist.
This indicates a problem in the process; for example, the blob id tracker file(s) were
restored from an older backup. (There may be other ways this could occur.)

The next time garbage collection runs, the collector tries to remove the same files again,
and that fails again.

It would be better if the ids of files that don't exist were removed from the blob id tracker
file, so that the collector does not keep trying to remove them on every later run.
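
As an illustration only, the pruning idea could be sketched in plain Java as shown here.
This is not Oak code: the class BlobIdPruner, the method prune, and the existsInStore
predicate are hypothetical names, and the sketch assumes the tracker file holds one blob
id per line and is small enough to read into memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper, not part of Oak. */
final class BlobIdPruner {

    private BlobIdPruner() {
    }

    /**
     * Rewrites idFile so that it keeps only the ids for which existsInStore
     * returns true. Assumes one blob id per line (an assumption of this sketch).
     *
     * @return the number of ids dropped from the file
     */
    static int prune(Path idFile, Predicate<String> existsInStore) throws IOException {
        List<String> kept = new ArrayList<>();
        int dropped = 0;
        for (String id : Files.readAllLines(idFile, StandardCharsets.UTF_8)) {
            if (id.isEmpty()) {
                continue; // ignore blank lines
            }
            if (existsInStore.test(id)) {
                kept.add(id);
            } else {
                dropped++; // record is missing in the data store, stop tracking it
            }
        }
        if (dropped > 0) {
            // Write to a temporary file in the same directory, then replace the
            // original, so a failure while writing leaves the old file in place.
            Path dir = idFile.toAbsolutePath().getParent();
            Path tmp = Files.createTempFile(dir, "blobids", ".tmp");
            Files.write(tmp, kept, StandardCharsets.UTF_8);
            Files.move(tmp, idFile, StandardCopyOption.REPLACE_EXISTING);
        }
        return dropped;
    }
}
```

In such a sketch the predicate would be backed by whatever existence check the data store
offers. A real tracker file may be large, so a real implementation would more likely
stream the file than load it at once.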

If the blob id tracker file(s) are incorrect, I think it would be better to delete and rebuild
them; otherwise some of the unreferenced binaries will never be removed. Possibly a warning
should be logged, with instructions on how to rebuild these files.

This message was sent by Atlassian Jira
