jackrabbit-oak-issues mailing list archives

From "Amit Jain (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-3122) Direct copy of the chunked data between blob stores
Date Tue, 21 Jul 2015 06:02:04 GMT

    [ https://issues.apache.org/jira/browse/OAK-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634560#comment-14634560 ]

Amit Jain commented on OAK-3122:

bq. Get handle to all top level blobIds
We would need to iterate over the nodes to get these, but then the original goal of direct
copying would no longer be possible.

bq. Use the blobId as the name of the file on FileSystem. Further create the file in same
way as done in FileDataStore i.e. 3 level depth
This may cause problems if the file path becomes too long, which I think it can in this case.
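
To illustrate the path-length concern, here is a minimal sketch. It assumes a FileDataStore-like layout in which the first three two-character prefixes of the id become directories and the full id becomes the file name. The class and method names (BlobPathSketch, pathFor) are hypothetical and are not part of Oak or Jackrabbit.

```java
import java.io.File;

// Hypothetical helper, for illustration only; not an Oak or Jackrabbit API.
final class BlobPathSketch {

    // Builds root/ab/cd/ef/abcdef... from an id, i.e. a 3-level-deep layout
    // with the full id reused as the file name.
    static File pathFor(File root, String blobId) {
        if (blobId.length() < 6) {
            throw new IllegalArgumentException("id too short for a 3-level layout");
        }
        File dir = new File(root, blobId.substring(0, 2));
        dir = new File(dir, blobId.substring(2, 4));
        dir = new File(dir, blobId.substring(4, 6));
        // The file name is as long as the id itself, so a long id
        // translates directly into a long file name.
        return new File(dir, blobId);
    }
}
```

Because the file name equals the id, an id longer than the per-name limit of the underlying file system (255 on many common file systems) could not be stored this way.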

bq. Modify DataStoreBlobStore to allow extracting length from id and then extract encoded
length in blobId to determine the length
I did not understand this point.

It seems to me that we need some sort of ID translation layer to support the AbstractBlobStore ->
DataStore cases.

> Direct copy of the chunked data between blob stores
> ---------------------------------------------------
>                 Key: OAK-3122
>                 URL: https://issues.apache.org/jira/browse/OAK-3122
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: core, mongomk, upgrade
>            Reporter: Tomek Rękawek
>             Fix For: 1.4
> It could be useful to have a tool that copies blob chunks directly between different
> stores, so that users can quickly migrate their data without needing to touch the node store,
> consolidate binaries, etc.
> Such a tool should have direct access to the methods operating on the binary blocks, implemented
> in the {{AbstractBlobStore}} and its subtypes:
> {code}
> void storeBlock(byte[] digest, int level, byte[] data);
> byte[] readBlockFromBackend(BlockId blockId);
> Iterator<String> getAllChunkIds(final long maxLastModifiedTime);
> {code}
> My proposal is to create a {{ChunkedBlobStore}} interface containing these methods, which
> can be implemented by {{FileBlobStore}} and {{MongoBlobStore}}.
> Then we can enumerate all chunk ids, read the underlying blocks from the source blob store
> and save them in the destination.
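
The enumerate/read/save loop described in the proposal could look roughly like the sketch below. It is only an illustration under simplifying assumptions: the interface uses plain String chunk ids rather than the digest/level/BlockId signatures listed above, and InMemoryChunkStore and ChunkCopier are hypothetical stand-ins, not existing Oak classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical version of the proposed interface.
interface ChunkedBlobStore {
    void storeChunk(String chunkId, byte[] data);
    byte[] readChunk(String chunkId);
    Iterator<String> getAllChunkIds(long maxLastModifiedTime);
}

// In-memory stand-in so the sketch is runnable; it ignores the timestamp filter.
final class InMemoryChunkStore implements ChunkedBlobStore {
    private final Map<String, byte[]> chunks = new LinkedHashMap<>();

    @Override
    public void storeChunk(String chunkId, byte[] data) {
        // Keep the first copy if the chunk already exists.
        chunks.putIfAbsent(chunkId, data.clone());
    }

    @Override
    public byte[] readChunk(String chunkId) {
        byte[] data = chunks.get(chunkId);
        return data == null ? null : data.clone();
    }

    @Override
    public Iterator<String> getAllChunkIds(long maxLastModifiedTime) {
        return new ArrayList<>(chunks.keySet()).iterator();
    }
}

final class ChunkCopier {
    // Copies every chunk from source to target without touching any node store.
    // Returns the number of chunks that were read and written.
    static int copyAll(ChunkedBlobStore source, ChunkedBlobStore target) {
        int copied = 0;
        Iterator<String> ids = source.getAllChunkIds(0);
        while (ids.hasNext()) {
            String id = ids.next();
            byte[] data = source.readChunk(id);
            if (data != null) {
                target.storeChunk(id, data);
                copied++;
            }
        }
        return copied;
    }
}
```

A real implementation would also have to carry over whatever per-block metadata the concrete stores keep (for example the level passed to storeBlock), which this sketch leaves out.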
