beam-commits mailing list archives

From "Eugene Kirpichov (JIRA)" <>
Subject [jira] [Assigned] (BEAM-2712) SerializablePipelineOptions should not call FileSystems.setDefaultPipelineOptions.
Date Tue, 01 Aug 2017 23:53:00 GMT


Eugene Kirpichov reassigned BEAM-2712:

    Assignee:     (was: Kenneth Knowles)

> SerializablePipelineOptions should not call FileSystems.setDefaultPipelineOptions.
> ----------------------------------------------------------------------------------
>                 Key: BEAM-2712
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-apex, runner-core, runner-flink, runner-spark
>            Reporter: Eugene Kirpichov
> introduces SerializablePipelineOptions, which on deserialization calls FileSystems.setDefaultPipelineOptions.
> This is obviously problematic and racy when the same process uses several SerializablePipelineOptions instances with different filesystem-related options in them.
> The PR does this because the Flink and Apex runners were already doing it in their respective SerializablePipelineOptions-like classes (being removed in the PR); Spark wasn't, but probably should have been.
> I believe this is done so that the proper filesystem options are automatically available on workers wherever any kind of PipelineOptions is used. Instead, all 3 runners should pick a better place to initialize their workers, and explicitly call FileSystems.setDefaultPipelineOptions there.
> It would be even better if FileSystems.setDefaultPipelineOptions didn't exist at all, but that's a topic for a separate JIRA.
> CC'ing runner contributors [~aljoscha] [~aviemzur] [~thw]
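
A minimal, self-contained sketch of the concern in the issue description, using only the JDK. Nothing here is Beam code: GlobalFileSystemConfig, SideEffectingOptions, PlainOptions and initializeWorker are hypothetical stand-ins for a process-wide default, an options wrapper that sets it during deserialization, a wrapper with no side effect, and an explicit worker-initialization hook. It only illustrates why a global write hidden inside deserialization lets the last-deserialized instance win, and how an explicit call keeps that write in one visible place.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-ins for illustration; none of these are Beam classes.
class DeserializationSideEffectSketch {

  /** Stand-in for a process-wide default, such as a static options registry. */
  static final class GlobalFileSystemConfig {
    private static volatile String defaultOptions = "unset";

    static void setDefault(String options) { defaultOptions = options; }

    static String getDefault() { return defaultOptions; }
  }

  /** Mimics the pattern from the issue: deserialization mutates global state. */
  static final class SideEffectingOptions implements Serializable {
    private static final long serialVersionUID = 1L;
    final String fsOptions;

    SideEffectingOptions(String fsOptions) { this.fsOptions = fsOptions; }

    private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException {
      in.defaultReadObject();
      // Hidden side effect: every deserialization overwrites the global default.
      GlobalFileSystemConfig.setDefault(fsOptions);
    }
  }

  /** Alternative: a plain holder with no side effect on deserialization. */
  static final class PlainOptions implements Serializable {
    private static final long serialVersionUID = 1L;
    final String fsOptions;

    PlainOptions(String fsOptions) { this.fsOptions = fsOptions; }
  }

  /** Hypothetical worker-startup hook: the only place the global is written. */
  static void initializeWorker(PlainOptions options) {
    GlobalFileSystemConfig.setDefault(options.fsOptions);
  }

  /** Serializes and deserializes a value through an in-memory buffer. */
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T value)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (T) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    // Two option sets deserialized in one process: the last one silently wins.
    roundTrip(new SideEffectingOptions("project-a"));
    roundTrip(new SideEffectingOptions("project-b"));
    System.out.println(GlobalFileSystemConfig.getDefault()); // prints "project-b"

    // Explicit initialization: deserializing other options changes nothing.
    initializeWorker(new PlainOptions("project-a"));
    roundTrip(new PlainOptions("project-b"));
    System.out.println(GlobalFileSystemConfig.getDefault()); // prints "project-a"
  }
}
```

With concurrent deserialization on several threads, the first pattern makes the final value depend on timing, which is the race the description refers to. The second pattern does not remove the global, it only makes the single write explicit.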

This message was sent by Atlassian JIRA
