db-derby-user mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: NFS and Derby
Date Thu, 11 Nov 2010 19:27:58 GMT
Kathey Marsden wrote:
> I have always told users they have to have their databases on a local 
> disk to ensure data integrity and that  a system crash for an NFS 
> mounted database could cause fatal corruption, but had a user this 
> morning take me to task on this and ask me to explain exactly why.  I 
> gave my general response about not being able to guarantee a sync to 
> disk over the network, but want to have a more authoritative reference 
> for why  you cannot count on an NFS mounted disk although I did find 
> several places where the sync option "favors data integrity" which 
> certainly doesn't sound like a guarantee.  Does anyone know a good 
> general reference I can use on this topic to support my "you gotta use a 
> local disk" mantra.
The problem is one of documentation and implementation of nfs.  I don't
think there is just one "nfs" out there.  And there are definitely all 
sorts of other remote mounting options.

Some of the problems that can arise on a remote file system but are
avoided on a local disk, and which are why, to be safe, we have documented
that we can't guarantee support, include:

1) We may not be able to prevent dual booting, and thus the db may get
corrupted.  All of our algorithms for preventing dual booting rely on the
JVMs that are accessing the database being on the same machine.  Once 2
machines can access the same file we have no way to prevent corruption.

2) Derby depends on synchronous write behavior when requested.  Basically 
at certain times Derby asks the JVM to guarantee that data for a table or 
recovery log file has been written and forced to disk before returning.
If this syncing is not correct a number of database problems can happen,
such as:
   a) we tell the user a transaction was committed because we believe the
      log was forced, but nfs was only caching the write and then crashes.
      Now the committed xact is not there.
   b) we want to remove some recovery log, so we force data to disk, wait
      for it to hit disk, and then delete the log file covering those disk
      updates.  But the data was actually only cached and is lost, and now
      we have old data in the db and no log files to recover it from.
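
To make point 1 concrete, here is a minimal sketch of a single-machine boot guard built on an OS file lock. It is an illustration only and not Derby's actual dual-boot protection: the class name, the tryBoot helper and the boot.lck file name are all hypothetical. A lock like this is arbitrated by the local operating system; whether a second machine mounting the same directory would see it depends on the remote file system and how it is configured, which is the gap described above.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SingleBootGuard {
    // Hypothetical helper (not Derby's API): try to take an exclusive lock
    // on a marker file inside the database directory. Returns null if another
    // process already holds the lock. If the same JVM already holds it,
    // tryLock() throws OverlappingFileLockException instead.
    static FileLock tryBoot(Path dbDir) throws IOException {
        Path lockFile = dbDir.resolve("boot.lck"); // hypothetical file name
        FileChannel ch = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock lock = ch.tryLock();
        if (lock == null) {
            ch.close(); // someone else has booted this database
        }
        return lock;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dbdir");
        FileLock lock = tryBoot(dir);
        System.out.println(lock != null ? "booted" : "already booted elsewhere");
        if (lock != null) {
            lock.release();
            lock.channel().close();
        }
    }
}
```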

  When this was first documented I don't believe any JVM implementation 
on top of nfs could guarantee a completed synchronous write.
It may be the case that certain remote file system implementations now 
can guarantee this, and it may be the case that the JVM implementations 
make the right calls to the nfs file system to do this - but I believe 
it is a support nightmare to try and support this.
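
For reference, this is roughly what "asking the JVM for a forced write" looks like from Java. It is a sketch under stated assumptions, not Derby's code: the class, the appendAndForce helper and the file names are made up for illustration. The javadoc for FileChannel.force() only promises that changes have reached the device when the file resides on a local storage device, and says no such guarantee is made otherwise; the method returns nothing, so the caller cannot tell which case applied. Both failure cases described under point 2 come from trusting that call.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ForcedWriteSketch {
    // Hypothetical helper (not Derby code): append a record and ask the JVM
    // to force it to the storage device before returning.
    static void appendAndForce(Path file, byte[] record) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Guaranteed durable only for local storage; returns void, so
            // there is no indication of whether the sync was honored.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-sketch");
        Path log = dir.resolve("log.dat");    // stands in for a recovery log
        Path data = dir.resolve("table.dat"); // stands in for a table file

        // Commit: the log record is forced, then the commit is acknowledged.
        // If the force was only cached remotely, the commit can vanish.
        appendAndForce(log, "commit xact 1\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("commit acknowledged");

        // Log removal: the data is forced first, then the log is deleted.
        // If the force was only cached, the log needed for recovery is gone.
        appendAndForce(data, "page image\n".getBytes(StandardCharsets.UTF_8));
        Files.delete(log);
        System.out.println("log removed");
    }
}
```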

A quick google of nfs topics seems to indicate that there may be some 
versions of nfs that do support write sync.  I believe this because most
of the hits that I got were descriptions of how to disable the syncing 
to get better performance, indicating that many of the nfs installations
that might support write sync actually have it disabled.  I did not see
any way that a java program could find out if the required syncing was
being enforced.

Note that we also cannot guarantee recovery on disks with write cache
enabled, which I believe many users have set.  Many may not even know it,
as I believe it is the default for some disk installations.

> Also I think our documentation on this topic should be a bit stronger.  
> Currently we just say it may not work and probably should be clearer 
> that data corruption could occur.  I will file an issue to beef up the 
> language based on the conversation in this thread.
> http://db.apache.org/derby/docs/10.5/devguide/cdevdvlp40350.html
> Thanks
> Kathey
