commons-dev mailing list archives

From Vince Bonfanti <>
Subject Re: [vfs] MemcachedFilesCache for Google App Engine (object serialization)
Date Sun, 07 Jun 2009 16:31:23 GMT
In order to solve this serialization problem, I'm trying to understand
exactly what the FilesCache is used for. Most importantly, I'm trying to
determine whether the FilesCache is used to cache file content.

I think the answer is: the FilesCache is used to optimize resolveFile()
performance, and to reuse FileObject instances, but is not used to cache the
actual file content. I came to this conclusion based on the
DefaultFileContent class, where the input and output streams are stored in a
ThreadLocal variable.

If my understanding is correct, then my task is greatly simplified. Do I
understand this properly? Thanks.
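A minimal sketch of the reasoning above, under stated assumptions: SketchContent is a hypothetical stand-in, not the real DefaultFileContent, and roundTrip() simply imitates what a serializing cache would do to a value. It shows that per-thread state held in a transient ThreadLocal is not part of the serialized form, so a cache storing such an object would not be storing stream state.

```java
import java.io.*;

public class ThreadLocalSketch {
    // Hypothetical stand-in for a content object; not the real DefaultFileContent.
    static class SketchContent implements Serializable {
        private static final long serialVersionUID = 1L;

        final String path;
        // Per-thread state: transient, so it is never written to the stream.
        private transient ThreadLocal<StringBuilder> threadData = newThreadData();

        SketchContent(String path) {
            this.path = path;
        }

        private static ThreadLocal<StringBuilder> newThreadData() {
            return ThreadLocal.withInitial(StringBuilder::new);
        }

        StringBuilder data() {
            return threadData.get();
        }

        // Transient fields come back null, so rebuild the ThreadLocal here.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            threadData = newThreadData();
        }
    }

    // Serialize and deserialize, as a serializing cache would.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SketchContent original = new SketchContent("/a/b.txt");
        original.data().append("per-thread stream state");

        SketchContent copy = roundTrip(original);
        System.out.println(copy.path);            // the identifying field survives
        System.out.println(copy.data().length()); // the per-thread state does not
    }
}
```

Whether the real class behaves this way depends on its actual fields, so this supports the conclusion only as far as the ThreadLocal observation goes.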
On Thu, Jun 4, 2009 at 5:22 PM, Vince Bonfanti <> wrote:

> I'm investigating creating a memcached-based FilesCache implementation for
> use with Google App Engine. The basic obstacle is that this requires that
> all objects that are used as keys or values must be serializable. Before I
> go too far down this path, I'd like to know if this is a reasonable thing to
> do (or consider doing). Any thoughts or guidance would be greatly
> appreciated.
> Background info: the Google App Engine (GAE) environment is inherently
> distributed, where an unknown number of instances of your
> servlet-based web application will be running at the same time. The GaeVFS
> project ( implements a VFS plugin on top of the
> GAE datastore; this is needed because the "real" filesystem within GAE is
> read-only, so GaeVFS provides a writeable filesystem for use by
> applications. It seems that for proper operation of VFS, the FilesCache
> implementation used by GaeVFS should (must?) also be inherently distributed.
> Fortunately, GAE supplies a memcached API that will be perfect for this use
> (if I can solve the serialization problem).
> Here's my first cut at what classes I'd need to make serializable and some
> notes on which fields might be marked transient:
> *AbstractFileName*
> serialized fields: scheme (String), absPath (String), type (FileType)
> transient fields: uri, baseName, rootUri, extension, decodedPath
> *FileType*
> serialized fields: name (String), hasChildren (boolean), hasContent
> (boolean), hasAttrs (boolean)
> transient fields: none
> *AbstractFileObject*
> serialized fields: name (AbstractFileName), content (DefaultFileContent)
> transient fields: fs (AbstractFileSystem), operation (FileOperations),
> attached (boolean), type (FileType), parent (FileObject), children
> (FileName[]), objects (List)
> It's very important that the fs (AbstractFileSystem) field be transient to
> limit the number of other classes that need to be made serializable. It
> looks like files are always retrieved from the files cache via the
> AbstractFileSystem.getFileFromCache() method; if so then the "fs" field of
> AbstractFileObject can be restored within this method and doesn't need to be
> serialized. We'll need to define a package-scope
> AbstractFileObject.setFileSystem() method to support this. Or, this could be
> done within the MemcachedFilesCache.getFile() method before returning the
> FileObject (but then we'd need a FileObject.setFileSystem method, or do some
> not-very-nice type casting).
> *DefaultFileContent*
> serialized fields: file (AbstractFileObject), attrs (Map), roAttrs (Map),
> fileContentInfo (FileContentInfo), fileContentInfoFactory
> (FileContentInfoFactory), openStreams (int)
> transient fields: threadData (ThreadLocal)
> Question: what types do the "attrs" and "roAttrs" maps contain? Are these
> serializable?
> Question: FileContentInfo and FileContentInfoFactory are interfaces; what
> are the implementations of these?
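The plan in the quoted message (serialize only the identifying fields, keep "fs" transient, and restore it when the file comes back out of the cache) can be sketched roughly as follows. Every class here is a hypothetical stand-in written for illustration; none of it is the real Commons VFS or GAE memcache API, and a HashMap of byte arrays plays the role of memcached.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

public class TransientFsSketch {
    // Hypothetical stand-in for AbstractFileSystem; deliberately NOT Serializable.
    static class SketchFileSystem {
        final String rootUri;

        SketchFileSystem(String rootUri) {
            this.rootUri = rootUri;
        }
    }

    // Hypothetical stand-in for AbstractFileName.
    static class SketchFileName implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String scheme;
        private final String absPath;
        private transient String baseName; // derived, recomputed on demand

        SketchFileName(String scheme, String absPath) {
            this.scheme = scheme;
            this.absPath = absPath;
        }

        String getScheme() { return scheme; }

        String getPath() { return absPath; }

        String getBaseName() {
            if (baseName == null) {
                baseName = absPath.substring(absPath.lastIndexOf('/') + 1);
            }
            return baseName;
        }
    }

    // Hypothetical stand-in for AbstractFileObject.
    static class SketchFileObject implements Serializable {
        private static final long serialVersionUID = 1L;

        private final SketchFileName name;
        private transient SketchFileSystem fs; // never serialized

        SketchFileObject(SketchFileName name, SketchFileSystem fs) {
            this.name = name;
            this.fs = fs;
        }

        // Package-scope setter, as proposed for restoring "fs" after a cache hit.
        void setFileSystem(SketchFileSystem fs) { this.fs = fs; }

        SketchFileSystem getFileSystem() { return fs; }

        SketchFileName getName() { return name; }
    }

    // Byte-array map standing in for a memcached client.
    static class SketchCache {
        private final Map<String, byte[]> store = new HashMap<>();

        void putFile(SketchFileObject file) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(file);
            }
            store.put(file.getName().getPath(), bytes.toByteArray());
        }

        SketchFileObject getFile(SketchFileSystem fs, String path)
                throws IOException, ClassNotFoundException {
            byte[] data = store.get(path);
            if (data == null) {
                return null; // cache miss
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(data))) {
                SketchFileObject file = (SketchFileObject) in.readObject();
                file.setFileSystem(fs); // reattach the transient reference
                return file;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        SketchFileSystem fs = new SketchFileSystem("gae://");
        SketchCache cache = new SketchCache();
        cache.putFile(new SketchFileObject(
                new SketchFileName("gae", "/docs/readme.txt"), fs));

        SketchFileObject cached = cache.getFile(fs, "/docs/readme.txt");
        System.out.println(cached.getName().getBaseName());
        System.out.println(cached.getFileSystem() == fs);
    }
}
```

Restoring the reference inside the cache's getFile() corresponds to the second option in the quoted message; doing it in the file system's own lookup method would keep the setter out of the cache class.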
