db-derby-user mailing list archives

From Michael Segel <mse...@segel.com>
Subject Re: single file database?
Date Fri, 16 Dec 2005 14:45:44 GMT
On Friday 16 December 2005 2:16 am, Roger Keays wrote:
> Michael Segel wrote:
> > On Wednesday 14 December 2005 6:39 pm, Roger Keays wrote:
> >>>If a developer uses Derby for storage this
> >>>would get cumbersome.  Zipping the derby DB
> >>>directory is one approach, but that
> >>>complicates application saves, crash recovery,
> >>>etc.
> >>
> >>That is what I had in mind. I read that Derby already offers read-only
> >>access to zipped files, so that's already half way there, right?
> >
> > Uhm no.
> http://db.apache.org/derby/docs/10.1/devguide/cdevdeploy11201.html#cdevdepl
> ?
Yes, that is correct. However, no, your assertion that you are "half-way" 
there is wrong.

They are two totally different beasts.

Before you even start to want to develop this into Derby, you're going to 
have to decide the direction that you want to take Derby. Do you want to keep 
this as a simple quick and dirty database that can be used in embedded 
systems? Or do you want to create more mature features that are found in 
products like Informix and DB2?  The issue is that as you add more mature 
features, you increase not only the complexity of the code but also the 

Since Derby is still somehow tied to IBM's Cloudscape, you then have potential 
political issues. On the other hand, Sun, which unlike its competitors doesn't 
have its own RDBMS, may look to Derby to have these features. (You have to 
look beyond the "altruistic spin" of both companies.)

Then, once you decide to go down this path, you have the issue of what 
changes to the core framework of Derby are necessary to make this happen.
Who makes these decisions? 

Then you have the issue of skill set. It's nice to see folks from Sun and IBM 
who seem to only have the ability to volunteer to guide others rather than do 
the work themselves. (Time commitment is one thing. IP leakage is another.)

Having said all of that, you have to design the tablespace, how to handle 
multiple page sizes (4K, 16K and 64K pages) and then the size of the metadata 
for each page. [Note, I'm thinking in terms of C, this may not be an issue in 
a Java-based design.]
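
To make the page-size and per-page metadata question concrete, here is a
minimal sketch in Java. Everything in it is an assumption made for
illustration: the class name PageLayout, the 20-byte header layout and the
rule of one page size per container file are hypothetical and are not taken
from Derby's storage code.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch only: none of these names or layouts come from Derby.
final class PageLayout {
    // Assumed fixed header: page number (8 bytes) + page size (4)
    // + used payload bytes (4) + checksum (4).
    static final int HEADER_BYTES = 20;

    final int pageSize;

    PageLayout(int pageSize) {
        // Only the three page sizes mentioned in the thread are accepted.
        if (pageSize != 4096 && pageSize != 16384 && pageSize != 65536) {
            throw new IllegalArgumentException("unsupported page size: " + pageSize);
        }
        this.pageSize = pageSize;
    }

    // Byte offset of a page inside the single container file,
    // assuming every page in that file has the same size.
    long offsetOf(long pageNumber) {
        return pageNumber * pageSize;
    }

    // Bytes left for row data once the header is accounted for.
    int payloadBytes() {
        return pageSize - HEADER_BYTES;
    }

    // Allocates a page-sized buffer and writes the assumed header into it.
    // The buffer's position is left just past the header.
    ByteBuffer newPage(long pageNumber) {
        ByteBuffer page = ByteBuffer.allocate(pageSize);
        page.putLong(pageNumber);
        page.putInt(pageSize);
        page.putInt(0); // used payload bytes: empty page
        page.putInt(0); // checksum placeholder
        return page;
    }

    public static void main(String[] args) {
        for (int size : new int[] {4096, 16384, 65536}) {
            PageLayout layout = new PageLayout(size);
            System.out.println(size + " byte page: " + layout.payloadBytes()
                    + " payload bytes, page 3 starts at offset " + layout.offsetOf(3));
        }
    }
}
```

Even this toy version shows the trade-off: a fixed header is a larger
fraction of a 4K page than of a 64K page, and mixing page sizes within one
file would need a page directory instead of the simple multiplication in
offsetOf.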

Then once you have your design, your internal APIs, you have to code it up and 
test it.

A database is more than a persistence of OO data. Nor is it a commodity. It's 
not just important to have a working solution, but one that is efficient, 
clean and fast. ;-)

It's a lot of work, and not that many people have the combination of skills, 
time, and desire to do this from scratch and still maintain their day job.

It's not a one-man job.

Now you can "cheat" by reviewing how Postgres handles the concept of 
tablespaces. Use it to give you a general feel for what has to occur and what 
types of structures are required.  Once you have that understanding, you 
have to apply it towards your design.  You can even look at IBM and 
Informix's documentation and Redbooks to get an idea of their design.

Or you can do it from scratch. 
And if you're a *real programmer* you write your classes using cat.* ;-)
 *["Q: How do you rate a C programmer on a scale of 1 to 10; What's a 10? 
      A: He can write a device driver using cat."] 

But hey, what do I know? ;-)
