systemml-dev mailing list archives

From "Frederick R Reiss" <frre...@us.ibm.com>
Subject Re: Remove "Scratch Space" In Favor Of Temp Folder
Date Thu, 07 Apr 2016 17:15:54 GMT

Back when I was new to the system, the scratch_space folder that kept
mysteriously appearing and disappearing in random places was a source of
puzzlement. I only figured out what that folder is for when I deleted it
and my SystemML process crashed. I think it would be good to put those temp
files someplace more private, or to make the default name something that
makes it clearer the directory belongs to SystemML.
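
A minimal sketch of that idea, assuming Python's standard tempfile module
rather than SystemML's actual Java code. The prefix "systemml-scratch-" and
the helper name make_scratch_dir are hypothetical, chosen only so that the
owner of the directory is obvious from its name:

```python
import os
import tempfile

# Hypothetical prefix, not an existing SystemML setting.
PREFIX = "systemml-scratch-"


def make_scratch_dir(parent=None):
    """Create a uniquely named scratch directory and return its absolute path.

    tempfile.mkdtemp creates the directory so that only the creating user
    can read, write, or search it, and it never reuses an existing name.
    """
    return os.path.abspath(tempfile.mkdtemp(prefix=PREFIX, dir=parent))


scratch = make_scratch_dir()
print(scratch)
```

With parent left unset, the directory lands in the platform's default temp
location instead of the current working directory, and the caller remains
responsible for deleting it when the run finishes.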

Fred

Matthias Boehm/Almaden/IBM@IBMUS wrote on 04/02/2016 08:32:08 PM:

> From: Matthias Boehm/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 04/02/2016 08:32 PM
> Subject: Re: Remove "Scratch Space" In Favor Of Temp Folder
>
> Just to clarify, the configuration 'scratch' (remote tmp working
> directory) is a user-defined configuration coming out of
> SystemML-config.xml, with an internal default of ./scratch_space if not
> specified, and it is always accessed as dfs (which, depending on your
> hadoop configuration, might use different file system
> implementations, e.g., hdfs, gpfs, fs).
>
> From my perspective, we should definitely keep the ability to
> specify a path for both local and remote tmp working directories
> because it really simplifies debugging. This is especially true if
> driver/client and executors/tasks run under different users (e.g.,
> with LinuxTaskController, LinuxContainerExecutor, or Spark's
> yarn-client). Btw, these scenarios are indeed good use cases for absolute
> paths because a relative path (if not handled correctly) actually
> refers to different locations for driver/executors.
>
> I would be fine with renaming this configuration to something like
> 'remotetmpdir' (consistent with our 'localtmpdir') and automatically
> obtaining temp working directories from hadoop if not specified.
>
> Regards,
> Matthias
>
>
> From: Mike Dusenberry <dusenberrymw@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 03/31/2016 10:58 AM
> Subject: Remove "Scratch Space" In Favor Of Temp Folder
>
>
>
> Hi all,
>
> Currently, SystemML makes use of a "scratch space" folder for temporary
> files during execution.  By default this is a relative "scratch_space"
> directory, which is resolved against the execution path (local mode) or
> placed in the user's directory on HDFS.  This works okay in some cases,
> although it can cause confusion as to why the folder exists.  In other
> cases, such as on Databricks Cloud, a relative path for HDFS is not
> allowed, and thus the user must change this "scratch space" folder to an
> absolute path, or else a strange error message will occur.
>
> Since this "scratch space" folder is just for temporary files during
> execution, might it be better to simply query HDFS (which falls back to
> local FS if needed) for a temporary folder, and just use that?  If so,
> this would remove the need to adjust this setting, thus making it easier
> to use SystemML.
>
> Thoughts?
>
>
> - Mike
>
> --
>
> Michael W. Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
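
The point in the quoted message about relative paths referring to different
locations for driver and executors can be illustrated with a small sketch.
It only models path resolution with Python's posixpath module and does not
touch Hadoop or SystemML. The working directories and the absolute path
below are made-up examples, and resolve is a hypothetical helper:

```python
import posixpath


def resolve(path, working_dir):
    """Return the location `path` denotes for a process whose working
    directory is `working_dir`. Absolute paths ignore the working directory."""
    return posixpath.normpath(posixpath.join(working_dir, path))


# Hypothetical working directories for a driver and an executor that run
# under different users.
driver_cwd = "/home/alice/job"
executor_cwd = "/var/yarn/container_01"

relative = "./scratch_space"               # the default mentioned in the thread
absolute = "/tmp/systemml/scratch_space"   # made-up absolute alternative

print(resolve(relative, driver_cwd))
print(resolve(relative, executor_cwd))
print(resolve(absolute, driver_cwd))
print(resolve(absolute, executor_cwd))
```

The first two results differ, so a driver and an executor that each resolve
the relative default on their own would look for temporary files in
different places. The last two results are identical, which is why an
absolute path (or one qualified once and then passed along) avoids the
problem.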
