samza-dev mailing list archives

From José Barrueta <>
Subject Samza YarnJobFactory support for https
Date Fri, 22 May 2015 03:03:21 GMT
Hi all,

Once we figured out the problem, we were able to easily come up with a
solution for it.

Basically, we want to be able to set the `yarn.package.path` property to
point to an artifact served over `https`. When we did this, we ran into the
following exception:

Exception in thread "main" No FileSystem for scheme:
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(
at org.apache.hadoop.fs.FileSystem.createFileSystem(
at org.apache.hadoop.fs.FileSystem.access$200(
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(
at org.apache.hadoop.fs.FileSystem$Cache.get(

First we looked at the actual YARN Resource Manager to make sure it
supported the https file system. After a while, we looked at the
YarnJobFactory code and found the current implementation:

class YarnJobFactory extends StreamJobFactory {
  def getJob(config: Config) = {
    // TODO fix this. needed to support http package locations.
    val hConfig = new YarnConfiguration
    hConfig.set("fs.http.impl", classOf[HttpFileSystem].getName)

    new YarnJob(config, hConfig)
  }
}
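
For context, here is a minimal sketch of why an `https` path fails with that factory. As far as I understand it, Hadoop consults the `fs.<scheme>.impl` configuration key (among other lookups) when it resolves a FileSystem class, and the code above only registers `fs.http.impl`. The names `FsImplKey` and `implKeyFor` and the example URL are made up for illustration; the snippet only derives the key name and does not call Hadoop.

```java
import java.net.URI;

public class FsImplKey {

    /**
     * Returns the Hadoop configuration key that would be looked up
     * for the scheme of the given package path.
     */
    static String implKeyFor(String packagePath) {
        String scheme = URI.create(packagePath).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("No scheme in: " + packagePath);
        }
        return "fs." + scheme + ".impl";
    }

    public static void main(String[] args) {
        // Hypothetical artifact location, used only as an example.
        System.out.println(implKeyFor("https://repo.example.com/artifacts/my-job.tgz"));
        System.out.println(implKeyFor("http://repo.example.com/artifacts/my-job.tgz"));
    }
}
```

With an `https` package path the derived key is `fs.https.impl`, which the factory never sets, so the lookup has nothing to return for that scheme.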

Like I said, after this it was easy to fix the issue: we just created
our own YarnJobFactory.

/**
 * YarnJobFactory is an implementation based on Samza's {@link
 * implementation.
 * @since 0.1.0
 */
public class YarnJobFactory implements StreamJobFactory {

    public StreamJob getJob(Config config) {

        Configuration yarnConfig = new YarnConfiguration();

        return new YarnJob(config, yarnConfig);
    }
}

This one supports both schemes, http and https. I noticed the TODO comment
in the current implementation. Is there a way I can contribute to enhance
it? I'm thinking the Samza configuration might specify the scheme and map
it to a FileSystem implementation.
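
As a rough sketch of that idea, the factory could read scheme-to-class entries from the job configuration and translate them into the `fs.<scheme>.impl` keys Hadoop expects. Everything here is an assumption for illustration: the `yarn.package.fs.` prefix is a hypothetical property name, not an existing Samza setting, and `SchemeMappingSketch` and `toHadoopFsSettings` are invented names. Plain maps stand in for the Samza and Hadoop configuration objects so the snippet runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemeMappingSketch {

    // Hypothetical Samza-side prefix; not an existing Samza property.
    static final String SAMZA_PREFIX = "yarn.package.fs.";
    static final String IMPL_SUFFIX = ".impl";

    /**
     * Collects entries such as yarn.package.fs.https.impl=<class name>
     * and rewrites them as fs.https.impl=<class name>.
     */
    static Map<String, String> toHadoopFsSettings(Map<String, String> samzaConfig) {
        Map<String, String> hadoopSettings = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : samzaConfig.entrySet()) {
            String key = entry.getKey();
            boolean matches = key.startsWith(SAMZA_PREFIX)
                    && key.endsWith(IMPL_SUFFIX)
                    && key.length() > SAMZA_PREFIX.length() + IMPL_SUFFIX.length();
            if (matches) {
                String scheme = key.substring(
                        SAMZA_PREFIX.length(), key.length() - IMPL_SUFFIX.length());
                hadoopSettings.put("fs." + scheme + IMPL_SUFFIX, entry.getValue());
            }
        }
        return hadoopSettings;
    }

    public static void main(String[] args) {
        Map<String, String> samzaConfig = new LinkedHashMap<>();
        samzaConfig.put("yarn.package.path", "https://repo.example.com/artifacts/my-job.tgz");
        samzaConfig.put("yarn.package.fs.http.impl", "org.apache.samza.util.hadoop.HttpFileSystem");
        samzaConfig.put("yarn.package.fs.https.impl", "org.apache.samza.util.hadoop.HttpFileSystem");

        // In a real factory each pair would be passed to Configuration.set(key, value).
        toHadoopFsSettings(samzaConfig)
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The point of the design is that supporting a new scheme would become a configuration change rather than a code change in the factory.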


Jose Luis
