db-derby-user mailing list archives

From Bryan Pendleton <bpendleton.de...@gmail.com>
Subject Re: Question about how Derby uses it's "tmp/" sub-directory - and what to do when it goes missing!
Date Sat, 04 Sep 2010 15:10:23 GMT
> I basically don't have a good understanding of what the "tmp/" directory
> is used for, or when it's created or goes away.

I suspect that you are running into a resource exhaustion situation,
which in turn is provoking a Derby bug.

The "tmp" directory is used by the low-level storage subsystem of Derby,
for purposes such as:
  - holding intermediate results during large external merge-sort runs
  - holding intermediate results during query processing

So, for example, if you issue a query which causes Derby to sort a
large amount of data, too much for the sort to be performed entirely
in memory, Derby performs an "external merge-sort", which uses
temporary files to hold the data. A GROUP BY or ORDER BY could cause
this, or a complicated join may do it if Derby's optimizer chooses a
merge-join strategy.
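
To make the term concrete, here is a minimal sketch of the general
external merge-sort technique. It is not Derby's code: the class name
ExternalSortSketch, the maxInMemory threshold, and the one-value-per-line
file format are all assumptions made for illustration. The point to
notice is that the merge phase holds every run file open at once, which
is why a very large sort can need both temporary files and many file
descriptors.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: a generic external merge sort, not Derby's implementation.
public class ExternalSortSketch {

    // One open run file plus the value currently at its head.
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        // Moves to the next value; returns false when the run is exhausted.
        boolean advance() throws IOException {
            head = reader.readLine();
            return head != null;
        }

        @Override
        public int compareTo(Run other) {
            return head.compareTo(other.head);
        }
    }

    // Sorts 'input' while buffering at most 'maxInMemory' values at a time.
    // Assumes values contain no line breaks. The result is collected into a
    // list only to keep the sketch short; a real sort would stream it.
    public static List<String> sort(Iterable<String> input, int maxInMemory, Path tmpDir)
            throws IOException {
        List<Path> runs = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        for (String value : input) {
            buffer.add(value);
            if (buffer.size() >= maxInMemory) {
                runs.add(spill(buffer, tmpDir));
            }
        }
        if (!buffer.isEmpty()) {
            runs.add(spill(buffer, tmpDir));
        }

        // Merge phase: one open file (and one file descriptor) per run.
        PriorityQueue<Run> heap = new PriorityQueue<>();
        List<BufferedReader> readers = new ArrayList<>();
        List<String> out = new ArrayList<>();
        try {
            for (Path p : runs) {
                BufferedReader r = Files.newBufferedReader(p);
                readers.add(r);
                Run run = new Run(r);
                if (run.head != null) {
                    heap.add(run);
                }
            }
            while (!heap.isEmpty()) {
                Run smallest = heap.poll();
                out.add(smallest.head);
                if (smallest.advance()) {
                    heap.add(smallest);
                }
            }
        } finally {
            for (BufferedReader r : readers) {
                r.close();
            }
            for (Path p : runs) {
                Files.deleteIfExists(p);
            }
        }
        return out;
    }

    // Sorts the in-memory buffer, writes it to a temporary run file, and empties it.
    private static Path spill(List<String> buffer, Path tmpDir) throws IOException {
        Collections.sort(buffer);
        Path run = Files.createTempFile(tmpDir, "run", ".tmp");
        Files.write(run, buffer);
        buffer.clear();
        return run;
    }
}
```

In this sketch, if the temporary directory is removed while a sort is in
progress, the next spill or read fails with an IOException.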

Another case is when Derby's optimizer chooses a hash join strategy, but
the table which is chosen to be hashed into memory is too large to fit
in memory, and the hash table overflows to disk.

I think there are also cases with scrollable updatable cursors which
can cause in-memory hash tables to need to overflow to disk.

The types of resources that you might run out of that are relevant to
these processing choices are: memory and open file descriptors.

I know that Derby has some theoretical problems in its handling of
open file descriptors for truly enormous sorts. See, for example:

The bottom line is: I think you are encountering a bug in Derby, and
I think you should file it in the Derby bug-tracking system and start
trying to gather as much information as you can, in order to get the
best chance of identifying and resolving it.
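
As one example of the kind of information that might be worth
capturing, the sketch below records whether the "tmp" sub-directory
exists and what it contains at a given moment. It assumes the
directory lives directly under the database directory you pass in; the
class name TmpDirSnapshot and the method countTmpEntries are
hypothetical names for illustration, not part of Derby.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative helper, not part of Derby.
public class TmpDirSnapshot {

    // Returns the number of entries in <dbDir>/tmp, or -1 if that
    // directory does not exist. Assumes "tmp" sits directly under dbDir.
    public static int countTmpEntries(Path dbDir) throws IOException {
        Path tmp = dbDir.resolve("tmp");
        if (!Files.isDirectory(tmp)) {
            return -1;
        }
        int count = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(tmp)) {
            for (Path entry : entries) {
                System.out.println(entry.getFileName());
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the database directory (supplied by you).
        System.out.println("tmp entries: " + countTmpEntries(Paths.get(args[0])));
    }
}
```

Running this periodically while the problem query executes, and noting
when the result changes to -1, could help narrow down when the
directory disappears.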

In the meantime, you *might* be able to work around your problem by either:
  a) giving Derby substantially more memory, to reduce its need to perform
     external merge-sorts, or
  b) checking whether you are hitting a file descriptor limit. If this
     is a Unix-based system (Linux/Solaris/MacOSX/etc.), you may be able to
     configure the system to give your process more file descriptors, which
     could avoid a "too many open files" error if that's what's causing this.
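
To see where you stand on both resources, something like the following
could be run inside the same JVM that hosts Derby. It is a sketch, not
a Derby facility: the class name ResourceCheck is made up, and the file
descriptor counts rely on the com.sun.management extension, which is
assumed to be present (it is on common Unix JVMs, but not on every JVM
or platform).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Illustrative helper, not part of Derby.
public class ResourceCheck {

    // Returns {open, max} file descriptor counts, or null when this JVM
    // does not expose them (for example on non-Unix platforms).
    public static long[] fileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    // Maximum heap size the JVM will attempt to use, in bytes.
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("max heap bytes: " + maxHeapBytes());
        long[] fd = fileDescriptors();
        if (fd == null) {
            System.out.println("file descriptor counts not available on this JVM");
        } else {
            System.out.println("open file descriptors: " + fd[0] + " of " + fd[1]);
        }
    }
}
```

If the open count is close to the maximum while the problem query runs,
raising the limit (for example with "ulimit -n" in the shell that
starts the JVM) is worth trying; the heap ceiling is normally raised
with the JVM's -Xmx option.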

Hope this helps,

