mxnet-issues mailing list archives

From "Junru Shao (JIRA)" <>
Subject [jira] [Created] (MXNET-1417) [Performance] Caching Dynamic Shape Checking Result
Date Tue, 18 Jun 2019 09:00:00 GMT
Junru Shao created MXNET-1417:

             Summary: [Performance] Caching Dynamic Shape Checking Result
                 Key: MXNET-1417
             Project: Apache MXNet
          Issue Type: Improvement
            Reporter: Junru Shao

h2. Description

(Please see appendix for experiment details)

PR [#14192|], which enables dynamic shapes,
slows down a model that originally ran in 235.65 ms by 7.26 ms (to 242.91 ms).

Note also that the seemingly related PR [#14665|], which presents
itself as a "[performance]" improvement, does not change the measured time in any meaningful way:
the model still runs in 242.35 ms.

This PR fixes the slowdown by caching the result of the check for whether a dynamic shape exists.
The mechanism itself is quite simple: once the existence of dynamic shapes has been checked, we
simply do not check again, because the graph does not change.
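
The caching idea above can be sketched roughly as follows. This is only an illustration in plain Python, not the actual MXNet change: the class {{CachedGraph}}, its {{node_shapes}} argument, and the convention that {{None}} marks an unknown dimension are all hypothetical names chosen for the example.

```python
class CachedGraph:
    """Hypothetical stand-in for a graph whose structure never changes."""

    def __init__(self, node_shapes):
        # node_shapes: list of shape tuples; None marks an unknown
        # (dynamic) dimension. This encoding is an assumption.
        self._node_shapes = node_shapes
        self._has_dynamic_shape = None  # None means "not checked yet"
        self.check_count = 0  # counts full scans, for illustration only

    def _scan_for_dynamic_shape(self):
        # The expensive part: walk every node of the graph.
        self.check_count += 1
        return any(dim is None
                   for shape in self._node_shapes
                   for dim in shape)

    def has_dynamic_shape(self):
        # Scan once; later calls reuse the stored answer, which is safe
        # only because the graph is assumed to be immutable.
        if self._has_dynamic_shape is None:
            self._has_dynamic_shape = self._scan_for_dynamic_shape()
        return self._has_dynamic_shape
```

With this sketch, calling {{has_dynamic_shape()}} on every forward pass costs one graph scan in total rather than one scan per call. The cache would have to be invalidated if the graph could ever be modified.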
h2. Checklist
h3. Essentials

Please feel free to remove inapplicable items for your PR.
 *  The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant [JIRA
issue|] created (except PRs with tiny changes)
 *  Changes are complete (i.e. I finished coding on this PR)
 *  All changes have test coverage:
 * Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
 * Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
 * Build tests will be added for build configuration changes (e.g. adding a new build option
with NCCL)
 *  Code is well-documented:
 * For user-facing API changes, API doc string has been updated.
 * For new C++ functions in header files, their functionalities and arguments are documented.
 * For new examples, README.md is added to explain what the example does, the source of
the dataset, expected performance on test set and reference to the original paper if applicable
 * Check the API doc at [$PR_ID/$BUILD_ID/index.html]
 *  To the best of my knowledge, examples are either not affected by this change, or have been
fixed to be compatible with this change

h3. Changes

h2. Comments

Experiment environment: EC2 p2.8xlarge, CUDA 10 and cuDNN 7.5. The model itself is confidential.

The detailed benchmark is shown below (mean ± stdev). Each configuration was measured over 20 runs,
excluding the warmup run.
 # On commit [{{39412b3}}|] (right
before PR [#14192|] was merged):
Hybridize w/ static_alloc: 235.65 ± 0.22246 ms

 # On commit [{{83d2c2d}}|] (where
PR [#14192|] is merged):
Hybridize w/ static_alloc: 242.91 ± 0.71125 ms

 # PR [#14665|] patched to commit [{{83d2c2d}}|]
Hybridize w/ static_alloc: 242.35 ± 0.25124 ms

 # After this patch applied to commit [{{83d2c2d}}|]
Hybridize w/ static_alloc: 234.95 ± 0.39334 ms
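
For readers who want to reproduce this kind of measurement, a timing loop in the spirit of the setup above (20 timed runs, warmup excluded, mean ± stdev) might look like the sketch below. The helper name {{benchmark}} and its parameters are hypothetical, and the use of the sample standard deviation is an assumption, since the issue does not say which estimator was used.

```python
import statistics
import time


def benchmark(fn, runs=20, warmup=1):
    """Time fn() over `runs` calls, after `warmup` untimed calls.

    Returns (mean_ms, stdev_ms). fn must block until its work has
    finished; for asynchronous GPU execution that means synchronizing
    inside fn, otherwise only the launch overhead is measured.
    """
    for _ in range(warmup):
        fn()  # warmup calls are not recorded

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    # statistics.stdev is the sample standard deviation (an assumption
    # about how the figures in this issue were computed).
    return statistics.mean(timings_ms), statistics.stdev(timings_ms)
```

A call such as {{mean_ms, stdev_ms = benchmark(run_model)}}, where {{run_model}} is a placeholder for one forward pass of the hybridized model, would yield numbers in the same "mean ± stdev" form as the list above.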

CC: [@szha|] [@zheng-da|] please review
