quetz-mod_python-dev mailing list archives

From "Gregory (Grisha) Trubetskoy" <gri...@modpython.org>
Subject Re: Using shared memory to do global persistence
Date Fri, 18 Jul 2003 17:38:17 GMT

I just wanted to summarize some findings on shared memory on the list so
we can keep this discussion going:

Shared memory would be most useful for storing short-lived, volatile
session data. This is frequently done in the Java world by assigning
attributes to an HttpSession.

The complication with Apache is its multi-process nature (as opposed to,
say, Java, which is multi-threaded). The challenge is to be able to
efficiently share data across processes.

Sharing data across processes can be done by using shared memory. Most
OSes support it in one form or another (shm or mmap), and APR provides a
cross-platform API to shared memory (as does Python).
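
To make the Python side of this concrete, here is a minimal sketch using
the standard mmap module. It assumes a Unix system, where an anonymous
mapping is shared by default and is inherited across fork(). The function
name is made up for illustration, and this is not how mod_python or APR
does it:

```python
import mmap
import os
import struct


def share_counter_across_fork():
    """Let a forked child modify an integer that the parent then reads."""
    # Anonymous mapping (fileno -1). On Unix the default flag is
    # MAP_SHARED, so parent and child see the same physical pages.
    buf = mmap.mmap(-1, mmap.PAGESIZE)
    struct.pack_into("<I", buf, 0, 1)

    pid = os.fork()
    if pid == 0:
        # Child: read the value, add to it, write it back, and exit
        # without running the parent's cleanup handlers.
        (value,) = struct.unpack_from("<I", buf, 0)
        struct.pack_into("<I", buf, 0, value + 41)
        os._exit(0)

    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    (result,) = struct.unpack_from("<I", buf, 0)
    buf.close()
    return result
```

Note that the mapping is created before the fork, which is the easy case.
There is also no locking here; the parent simply waits for the child.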

One problem with shared memory is that to allocate space in it you have to
use something other than malloc(), therefore it's not possible to simply
allocate a Python object in shared memory since Python uses malloc. It may
be possible to copy a Python object into shared memory, but it becomes a
very complex issue if you figure in dealing with pointers and Python
reference counting. The paper on POSH describes it quite well.

But for simplicity's sake, let's assume that we're just dealing with
strings (the developer may choose to pickle objects if he wishes).
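
As a rough sketch of the "just strings" approach: an object is pickled
into bytes and copied into a flat buffer behind a length prefix. The
store/load names and the 4-byte header layout are my own assumptions, and
the buffer could equally be an mmap object or a bytearray:

```python
import pickle
import struct

# Hypothetical layout: a 4-byte little-endian length, then the pickle.
HEADER = struct.Struct("<I")


def store(buf, obj):
    """Pickle obj and copy it into buf behind a length prefix."""
    payload = pickle.dumps(obj)
    end = HEADER.size + len(payload)
    if end > len(buf):
        raise ValueError("pickled object does not fit in the buffer")
    HEADER.pack_into(buf, 0, len(payload))
    buf[HEADER.size:end] = payload


def load(buf):
    """Read the length prefix and unpickle the bytes that follow it."""
    (length,) = HEADER.unpack_from(buf, 0)
    return pickle.loads(bytes(buf[HEADER.size:HEADER.size + length]))
```

Every read pays for a copy and an unpickle, which is the price of not
having live Python objects in the shared segment.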

The second problem is locking. JSP, for example, synchronizes access to
sessions so that no two requests belonging to the same session are
processed at the same time. This means that there needs to be a mechanism
for locking each unique session across different processes.

Although APR provides a way to create global (i.e. across all
processes) mutexes, such a mutex must be created in the post-config (i.e.
before forking) stage. So the APR global_mutex isn't very useful here,
unless we somehow compromise by using a lock per virtual server, or per
number of sessions, etc.
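
The "lock per number of sessions" compromise could look something like
the sketch below: a fixed pool of locks is created before forking, and
each session id is hashed onto one of them. This uses Python's
multiprocessing.Lock as a stand-in for an APR global mutex, and NUM_LOCKS,
lock_for and handle_request are names invented for the example:

```python
import multiprocessing
import zlib

# Hypothetical pool size. More locks means fewer unrelated sessions
# contending for the same lock, at the cost of more kernel resources.
NUM_LOCKS = 16

# Must be created once in the parent, before any children are forked,
# so that every child inherits the same underlying locks.
SESSION_LOCKS = [multiprocessing.Lock() for _ in range(NUM_LOCKS)]


def lock_for(session_id):
    """Map a session id onto one lock of the fixed pool."""
    # crc32 gives the same value in every process, unlike the built-in
    # hash(), which may be randomized per interpreter.
    index = zlib.crc32(session_id.encode("utf-8")) % NUM_LOCKS
    return SESSION_LOCKS[index]


def handle_request(session_id, work):
    """Run work() while holding the lock that covers this session."""
    with lock_for(session_id):
        return work()
```

Two requests for the same session always serialize. Two different
sessions that happen to hash to the same slot also serialize, which is
the compromise.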

Also, APR (depending on the OS) may use a type of lock that is limited,
e.g. there are only so many System V semaphores or pthread mutexes that
can be allocated per process because they consume kernel resources.

POSH uses its own implementation of spin locks, which (1) are not portable
(spin locks must be written in assembly) and (2) don't work at all on a
single-processor machine.

Another complication with both shared memory and locks is that it is much
more difficult to create them from within a child process because base
addresses for the same shm segment will differ. Therefore both locks and
shared memory have to be specified in the Apache config (and not
.htaccess).
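
To illustrate why differing base addresses matter: anything stored inside
the segment has to refer to other data by offset from the start of the
segment, not by pointer. Here is a small sketch of a linked list kept in a
flat buffer that way. The node layout and function names are assumptions
made for the example, and a bytearray stands in for the shm segment:

```python
import struct

# Hypothetical node layout: (offset of the next node, value).
NODE = struct.Struct("<iI")
END = -1  # marks the last node, in place of a NULL pointer


def write_node(buf, offset, next_offset, value):
    """Write one node at the given offset within the segment."""
    NODE.pack_into(buf, offset, next_offset, value)


def walk(buf, head_offset):
    """Follow next-offsets from the head and collect the values."""
    values = []
    offset = head_offset
    while offset != END:
        next_offset, value = NODE.unpack_from(buf, offset)
        values.append(value)
        offset = next_offset
    return values
```

Because only offsets are stored, the same bytes can be walked correctly no
matter where the segment ends up mapped in a given process.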

Anyway, I just wanted to throw that out there in case someone has
suggestions, comments, things to look at, etc. At this point I don't have
any plan on how to proceed with inter-process data sharing, and I'm not
sure if we should bother at all. It's doable, but sure seems like a lot of
work!

On Thu, 26 Jun 2003, Jonathan Gardner wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Thursday 26 June 2003 11:36, Greg Stein wrote:
> > APR has facilities to do shared memory in a portable fashion; APR is part
> > of Apache 2.0, so there isn't much reason to go grab any other library.
> >
> > You can also use APRUTIL's apr_rmm.h to manage sub-allocations within the
> > shared memory segments. The problem is that a shared memory segment could
> > be mapped to different addresses in different processes. Thus, you want to
> > hold onto offsets into a shared memory segment. apr_rmm.h helps with
> > managing these subblocks and working with offsets rather than direct
> > pointers. Note that apr_rmm also handles locking so that you can have
> > multiple processes allocating (simultaneously) from a shared mem segment.
> >
> > You can then layer additional Python facilities on top of this substrate.
> >
>
> The python facility would be something like POSH.
>
> So, based on this new information, the project scope would now become:
>
> 1) Expand POSH so that it can use shared objects that were shared by a foreign
> process (provided with some information on which shared memory segment they
> are using)
>
> 2) Integrate POSH with mod_python and APR.
>
> The problem I see now: How to communicate between all of the processes that
> there are shared objects available, and detail where those shared objects
> are? I don't think it is possible to create shared objects via mod_python
> before the processes are separated. Even if it was, is it possible to
> transfer references to those shared objects to each process?
>
> The only solution I see right now is to have some central repository that any
> process can access and declare the existence of shared objects, their
> location, and whatever else is needed. Other processes can read the
> repository and find currently existing shared objects by a unique string.
>
> The exact nature of the repository isn't important. It could be a bit of
> shared memory in a special location, a file, a Berkeley DB, or even something
> more exotic. The point is that the processes can declare new shares, or find
> existing shares by a unique identifier.
>
> - --
> Jonathan Gardner <jgardner@jonathangardner.net>
> (was jgardn@alumni.washington.edu)
> Live Free, Use Linux!
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.2.1 (GNU/Linux)
>
> iD8DBQE++z9mWgwF3QvpWNwRAmawAKDDzpi9kOyIu88CZaCxVTsCqYQ1uwCgymzr
> n6NkA9YggvsuqJcdzmnzpdc=
> =rHuV
> -----END PGP SIGNATURE-----
>
> _______________________________________________
> Mod_python mailing list
> Mod_python@modpython.org
> http://mailman.modpython.org/mailman/listinfo/mod_python
>
