flink-issues mailing list archives

From NicoK <...@git.apache.org>
Subject [GitHub] flink pull request #4961: [FLINK-7973] fix shading and relocating Hadoop fo...
Date Mon, 06 Nov 2017 19:23:38 GMT
GitHub user NicoK opened a pull request:


    [FLINK-7973] fix shading and relocating Hadoop for the S3 filesystems

    ## What is the purpose of the change
    The current shading of the `flink-s3-fs-hadoop` and `flink-s3-fs-presto` projects also
relocates Flink core classes and even some classes from the JDK itself. Additionally, the relocation
of Hadoop does not work as expected because Hadoop loads classes by the names listed in its
`core-default.xml`; these names are not rewritten by shading and thus still use the original namespace.
    ## Brief change log
    - adapt the `pom.xml` of both `flink-s3-fs-hadoop` and `flink-s3-fs-presto`:
      - do not shade everything and instead define include patterns explicitly
      - only shade and relocate Flink classes imported from flink-hadoop-fs
    - hack around Hadoop loading (unshaded/non-relocated) classes based on names in the `core-default.xml`
by replacing the `Configuration` class (we may also need to extend this to `mapred-default.xml`
and `hdfs-default.xml` and their respective configuration classes in the future):
      - provide a `core-default-shaded.xml` file with shaded class names and
      - copy and adapt the `Configuration` class of the respective Hadoop version to load
this file instead of `core-default.xml`
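To illustrate what "shaded class names" means for the entries of `core-default-shaded.xml`, here is a minimal sketch of how a property value could be rewritten. It is not the code of this pull request: the helper `relocate` and the class `ShadedNameSketch` are hypothetical, and the prefix `org.apache.flink.fs.s3hadoop.shaded.` is an assumption taken from the example namespace mentioned in the verification notes.

```java
import java.util.StringJoiner;

// Hypothetical helper, for illustration only; not part of Flink or Hadoop.
final class ShadedNameSketch {

    // Package prefix of the original (non-relocated) Hadoop classes.
    static final String HADOOP_PACKAGE = "org.apache.hadoop.";

    /**
     * Rewrites a property value that holds one class name or a comma-separated
     * list of class names: every name in the Hadoop package gets the shaded
     * prefix prepended, all other names are kept unchanged.
     */
    public static String relocate(String value, String shadedPrefix) {
        StringJoiner out = new StringJoiner(",");
        for (String part : value.split(",")) {
            String name = part.trim();
            out.add(name.startsWith(HADOOP_PACKAGE) ? shadedPrefix + name : name);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed relocation prefix; the real one is whatever the pom.xml defines.
        String prefix = "org.apache.flink.fs.s3hadoop.shaded.";
        System.out.println(relocate("org.apache.hadoop.fs.s3a.S3AFileSystem", prefix));
    }
}
```

Values that do not name Hadoop classes (numbers, paths, third-party classes) pass through unchanged, so in this sketch only entries Hadoop would resolve by class name are affected.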
    ## Verifying this change
    This change can be (and was) tested manually as follows:
    - verify the shaded `jar` file does not contain non-relocated classes
    - verify the changed `Configuration` classes reside in the shaded namespace where the
original Hadoop `Configuration` classes would otherwise be relocated to, e.g. `org.apache.flink.fs.s3hadoop.shaded.org.hadoop.conf`
(look for the `core-default-shaded.xml` string in the `Configuration.class` file)
    - verify the `META-INF/services` files are still correct (name + content)
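The first check above was done by hand; a rough sketch of how it could be automated follows. The class `ShadedJarCheck`, its methods, and the allowed prefix `org/apache/flink/fs/s3hadoop/` are assumptions for illustration and are not part of this pull request. Only standard `java.util.jar` calls are used.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical check, for illustration only.
final class ShadedJarCheck {

    /** Returns all .class entries whose path starts with none of the allowed prefixes. */
    public static List<String> findNonRelocated(List<String> entryNames, List<String> allowedPrefixes) {
        List<String> offenders = new ArrayList<>();
        for (String name : entryNames) {
            if (!name.endsWith(".class")) {
                continue; // resources such as META-INF/services are checked separately
            }
            boolean allowed = false;
            for (String prefix : allowedPrefixes) {
                if (name.startsWith(prefix)) {
                    allowed = true;
                    break;
                }
            }
            if (!allowed) {
                offenders.add(name);
            }
        }
        return offenders;
    }

    /** Lists the entry names of the jar file at the given path. */
    public static List<String> listEntries(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed namespace that holds both the connector and its relocated dependencies.
        List<String> allowed = Collections.singletonList("org/apache/flink/fs/s3hadoop/");
        for (String offender : findNonRelocated(listEntries(args[0]), allowed)) {
            System.out.println(offender);
        }
    }
}
```

Run with the path to the shaded jar as the only argument; an empty output would indicate that every class file sits under the allowed namespace.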
    ## Does this pull request potentially affect one of the following parts:
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing,
Yarn/Mesos, ZooKeeper: (no)
      - The S3 file system connector: (yes)
    ## Documentation
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-7973

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4961


