hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15193) add bulk delete call to metastore API & DDB impl
Date Mon, 08 Oct 2018 15:15:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641997#comment-16641997 ]

Steve Loughran commented on HADOOP-15193:
-----------------------------------------

DDB batch delete just takes the list of operations and runs through them in sequence, retrying
if needed. There is no speedup compared to making individual requests.
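
A minimal sketch of the "take a list, run through it, retry if needed" pattern, for illustration only. It is not the S3Guard or AWS SDK code: `batch_delete`, `submit` and `max_attempts` are hypothetical names, and `submit` stands in for whatever call sends one batch and returns the keys it did not process. The one external fact assumed is that DynamoDB's BatchWriteItem accepts at most 25 put/delete requests per call.

```python
from typing import Callable, List, Sequence

# DynamoDB BatchWriteItem accepts at most 25 put/delete requests per call.
MAX_BATCH = 25


def batch_delete(
    keys: Sequence[str],
    submit: Callable[[List[str]], List[str]],  # hypothetical: returns unprocessed keys
    max_attempts: int = 5,
) -> int:
    """Delete keys in chunks, resubmitting whatever comes back unprocessed.

    Returns the number of calls made to `submit`.
    """
    calls = 0
    for start in range(0, len(keys), MAX_BATCH):
        pending = list(keys[start:start + MAX_BATCH])
        attempts = 0
        while pending:
            if attempts >= max_attempts:
                raise RuntimeError(f"{len(pending)} deletes still unprocessed")
            # Each chunk is handled in sequence; leftovers are retried.
            pending = list(submit(pending))
            calls += 1
            attempts += 1
    return calls
```

The caller still issues one request per chunk and waits on each in turn, so the sketch shows the bookkeeping rather than any performance gain.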

We do still need a call in the metastore API, though, as it can be smarter about the operation.

In particular: if I delete a directory, do I need to explicitly add deleted markers to all
the children, or would a delete marker on the dir be enough? If the latter, you could be very
efficient and not create deleted-file markers, just those for the directories.
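
To make the second option in that question concrete, here is a rough sketch of a store where a single marker on the directory hides everything beneath it. It is only an illustration of the idea being asked about, not how the metastore behaves today. `TombstoneStore` and its methods are hypothetical names, and the sketch assumes lookups are allowed to walk up the ancestor chain and that nothing is re-created under a deleted directory.

```python
from pathlib import PurePosixPath


class TombstoneStore:
    """Toy in-memory metastore: path -> "file", "dir" or "deleted"."""

    def __init__(self) -> None:
        self.entries = {}

    def put(self, path: str, kind: str) -> None:
        self.entries[path] = kind

    def delete_dir(self, path: str) -> None:
        # One marker on the directory; child entries are left untouched.
        self.entries[path] = "deleted"

    def exists(self, path: str) -> bool:
        p = PurePosixPath(path)
        # A delete marker on the path or on any ancestor hides the path.
        for candidate in (p, *p.parents):
            if self.entries.get(str(candidate)) == "deleted":
                return False
        return str(p) in self.entries


store = TombstoneStore()
store.put("/data", "dir")
store.put("/data/a.txt", "file")
store.put("/other/b.txt", "file")
store.delete_dir("/data")
print(store.exists("/data/a.txt"), store.exists("/other/b.txt"))
```

The trade-off in this sketch is that the delete writes one entry, while every lookup may have to check each ancestor, and stale child entries stay in the table until something prunes them.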


> add bulk delete call to metastore API & DDB impl
> ------------------------------------------------
>
>                 Key: HADOOP-15193
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15193
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Priority: Major
>
> recursive dir delete (and any future bulk delete API like HADOOP-15191) benefits from using the DDB bulk table delete call, which takes a list of deletes and executes. Hopefully this will offer better perf.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

