jclouds-notifications mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCLOUDS-1488) Filesystem list call with prefix is slow in large containers
Date Wed, 30 Jan 2019 01:47:00 GMT

    [ https://issues.apache.org/jira/browse/JCLOUDS-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755561#comment-16755561 ]

ASF subversion and git services commented on JCLOUDS-1488:

Commit 29eec441e902162394a5dbceaf98c14d3b2bc87a in jclouds's branch refs/heads/local-blobstore/list-prefix
from Andrew Gaul
[ https://gitbox.apache.org/repos/asf?p=jclouds.git;h=29eec44 ]

JCLOUDS-1371: JCLOUDS-1488: list optimize prefix

Previously getBlobKeysInsideContainer returned all keys and filtered
in LocalBlobStore.  Now getBlobKeysInsideContainer filters via prefix
which can dramatically decrease the number of keys returned,
especially for the filesystem provider.  Further optimizations are
possible for delimiter.
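
To illustrate the idea in the commit message, here is a minimal sketch that uses only java.nio. It is not the jclouds implementation. The class and method names (PrefixListingSketch, listAllThenFilter, listWithPrefix, keysUnder, demo) are hypothetical, and it assumes blob keys map directly to relative file paths with "/" as the separator and no ".." segments. The first method enumerates every key and filters afterwards; the second starts the walk at the deepest directory named by the prefix, so files outside that directory are never visited.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PrefixListingSketch {

    /** Old shape: enumerate every key in the container, then filter by prefix. */
    static List<String> listAllThenFilter(Path container, String prefix) {
        return keysUnder(container, container).stream()
                .filter(key -> key.startsWith(prefix))
                .sorted()
                .collect(Collectors.toList());
    }

    /** New shape: walk only the deepest directory that the prefix names. */
    static List<String> listWithPrefix(Path container, String prefix) {
        int lastSlash = prefix.lastIndexOf('/');
        // Assumption: the prefix is a relative key without ".." segments.
        Path start = lastSlash < 0
                ? container
                : container.resolve(prefix.substring(0, lastSlash));
        if (!Files.isDirectory(start)) {
            return Collections.emptyList();
        }
        // The prefix may end in a partial file name, so still filter the keys.
        return keysUnder(container, start).stream()
                .filter(key -> key.startsWith(prefix))
                .sorted()
                .collect(Collectors.toList());
    }

    /** Returns container-relative keys for all regular files under start. */
    private static List<String> keysUnder(Path container, Path start) {
        try (Stream<Path> paths = Files.walk(start)) {
            return paths.filter(Files::isRegularFile)
                    .map(p -> container.relativize(p).toString().replace('\\', '/'))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a small temporary container and lists it both ways. */
    static List<String> demo() {
        try {
            Path container = Files.createTempDirectory("test-container");
            Files.createDirectories(container.resolve("test-container-subdirectory"));
            Files.createFile(container.resolve("test-container-subdirectory/file-to-list"));
            Files.createDirectories(container.resolve("other"));
            for (int i = 0; i < 100; i++) {
                Files.createFile(container.resolve("other/file-" + i));
            }
            String prefix = "test-container-subdirectory/";
            List<String> filteredLate = listAllThenFilter(container, prefix);
            List<String> filteredEarly = listWithPrefix(container, prefix);
            if (!filteredLate.equals(filteredEarly)) {
                throw new IllegalStateException("listings differ");
            }
            return filteredEarly;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both methods return the same keys; the difference is how many files the walk touches. With a prefix that names a small subdirectory, the second walk's cost depends on that subdirectory's size rather than on the size of the whole container.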

> Filesystem list call with prefix is slow in large containers
> ------------------------------------------------------------
>                 Key: JCLOUDS-1488
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-1488
>             Project: jclouds
>          Issue Type: Bug
>          Components: jclouds-blobstore
>    Affects Versions: 2.1.1
>         Environment: Java version: java version "1.8.0_131"
> Operating system: Fedora 27 x86_64
>            Reporter: Lari Sinisalo
>            Priority: Major
>              Labels: filesystem
>         Attachments: JCLOUDS1488.java
> When the filesystem blobstore is used, running the following code takes a very long time if there are many files in the container:
> {code:java}
>     ListContainerOptions options = new ListContainerOptions();
>     options.prefix("test-container-subdirectory/");
>     Set<? extends StorageMetadata> results =
>       blobStore.list("test-container", options);
> {code}
> See the attached Java source file [^JCLOUDS1488.java] for the full code.
> On my system, running the attached Java code takes over 10 seconds to list a single file when the container holds 500,000 files outside that prefix.
> Output from the attached code:
> {code:java}
> Number of blobs listed: 1
> First listed blob: test-container-subdirectory/file-to-list
> Time it took to list the blobs: 13256 ms
> {code}
> A more general version of this problem was reported previously in JCLOUDS-1371.

This message was sent by Atlassian JIRA
