db-derby-user mailing list archives

From Lily Wei <lily...@yahoo.com>
Subject Re: Derby in-memory back end - where to go next?
Date Fri, 25 Sep 2009 23:26:26 GMT
Thank you so much for the information.
With more smartphones on the market, I think people will use the in-memory feature more and
appreciate the performance improvement.


Thanks again,

From: Kristian Waagan <Kristian.Waagan@Sun.COM>
To: Derby Discussion <derby-user@db.apache.org>
Sent: Friday, September 25, 2009 2:45:53 AM
Subject: Re: Derby in-memory back end - where to go next?

Rick Hillegas wrote:
> Hi Lily,
> Some responses inline...
> Lily Wei wrote:
>> Hi Rick:
>>      I have some follow up questions.
>> Middle-tier caching, monitoring transient data streams and test rigs totally make sense.
>> Do you see any benchmarks in terms of how Derby helps these applications?
> I don't think we've published any figures on the performance boost you get from running
> in memory. My anecdotal recollection is that you see a significant boost once you've gotten
> past database creation. Kristian has done the most extensive testing and may have some figures
> that he can share. Unfortunately, he suffered an accident earlier this week and is up on the
> blocks for a while.

Hello Rick and Lily,

The performance benefit you'll see with the in-memory back end is highly dependent on the
load and the underlying disk subsystem.
For write-intensive loads the boost can be orders of magnitude.
For read-intensive loads the boost can be close to zero.

If you have a read-only database, it may be better in some cases to keep the database on disk,
maximize the page cache size and then prime the cache (pulling all pages into the cache).
The downside of using the in-memory back end in such a scenario is that some of the data
will be stored twice: once in the "virtual in-memory file system" and once in the page cache.
For the same reason, you should tune the page cache size according to your amount of data
and heap size. Minimizing the page cache (i.e. allowing only 40 pages) to avoid the "data
duplication" problem is not a good idea for optimal performance...
For some more information about the effects of page cache size and page size, see [1]. It
is really a comparison between two implementations of an in-memory back end, but closer to
the end of the document there are some relevant experiments.
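
As a rough illustration of the page cache sizing discussed above, the sketch below turns a memory
budget into a page count and hands it to Derby through the derby.storage.pageCacheSize system
property (counted in pages, with a minimum of 40). The 64 MiB budget, the 4 KiB page size and the
helper name pagesForBudget are assumptions made up for this example, not recommendations. The
property has to be set before the Derby engine boots.

```java
public class PageCacheSizing {
    /** Lower bound Derby accepts for derby.storage.pageCacheSize. */
    public static final int MIN_PAGES = 40;

    /**
     * Hypothetical helper: how many pages fit into the given budget,
     * never going below Derby's minimum.
     */
    public static int pagesForBudget(long cacheBudgetBytes, int pageSizeBytes) {
        long pages = cacheBudgetBytes / pageSizeBytes;
        return (int) Math.max(MIN_PAGES, Math.min(Integer.MAX_VALUE, pages));
    }

    public static void main(String[] args) {
        // Assumed numbers, for illustration only: 64 MiB of heap for the cache, 4 KiB pages.
        int pages = pagesForBudget(64L * 1024 * 1024, 4096);

        // Must happen before the first connection boots the Derby engine.
        System.setProperty("derby.storage.pageCacheSize", Integer.toString(pages));
        System.out.println("derby.storage.pageCacheSize=" + pages);
    }
}
```

With the in-memory back end, remember that pages held in the cache are held in the heap a second
time, so the budget should be chosen with the total heap size in mind.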

Unfortunately I'm unable to find the numbers I had comparing the disk based back end with
the in-memory back end.
If anyone wants some hard numbers, they can try running the various performance clients found
in the source code repository (under trunk/testing/.../perf/clients). The simplest ones are
the single record operation clients and the bank_tx load.

In my opinion, the primary use cases for the current in-memory back end are testing and development.
In the next release it may be better suited for storing purely transient data in a production
environment as well (with a proper delete mechanism and maybe a size limit feature).
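
For the testing and development use case, a minimal sketch of opening an in-memory database over
JDBC might look like the following. It uses the "memory" subsubprotocol in the connection URL; the
database name scratchDB, the table and the helper inMemoryUrl are hypothetical and exist only for
this example. It assumes derby.jar is on the classpath and simply reports the failure otherwise.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class InMemoryDemo {
    /** Hypothetical helper: builds a URL that creates the in-memory database if needed. */
    public static String inMemoryUrl(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    public static void main(String[] args) {
        String url = inMemoryUrl("scratchDB");
        try (Connection conn = DriverManager.getConnection(url);
             Statement st = conn.createStatement()) {
            // Everything below lives in main memory and is gone when the JVM exits.
            st.executeUpdate("CREATE TABLE t (id INT PRIMARY KEY, payload VARCHAR(64))");
            st.executeUpdate("INSERT INTO t VALUES (1, 'transient row')");
            System.out.println("Created in-memory database via " + url);
        } catch (SQLException e) {
            // Typically means the Derby driver is not on the classpath.
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```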

-- Kristian

[1] https://issues.apache.org/jira/secure/attachment/12400859/derby-646-performance_comparison_1a.txt
>> In aspects such as performance, total memory consumption, or reduced hardware cost?
>>      Do you see other embedded databases that also provide a solution on the stripped-down
> I don't think that H2 or HSQLDB run on CDC.
> Regards,
> -Rick
>> Do you have any data point for Derby?
>> Thank you so much for shedding some light for people like me,
>> Lily
>> *From:* Rick Hillegas <Richard.Hillegas@Sun.COM>
>> *To:* Derby Discussion <derby-user@db.apache.org>
>> *Sent:* Wednesday, September 9, 2009 2:01:01 PM
>> *Subject:* Re: Derby in-memory back end - where to go next?
>> Hi Lily,
>> Some comments inline...
>> Lily Wei wrote:
>> >
>> > Hi Rick:
>> >
>> >      Thank you so much for sharing the information with the group.
>> >
>> > >* It would be great to be able to bound the growth of the in-memory db
>> >
>> > Is there a trend toward in-memory databases in the Java world?
>> >
>> I find that this consistently generates a lot of discussion whenever I talk about
>> 10.5 features with users.
>> >
>> > Is it mainly for applications, i.e. ERP, CRM, SRM?
>> >
>> The top use-cases which keep coming up are:
>> o Middle-tier caching -- here people use Derby in the middle tier in order to scale
>> out access to a big back end like Oracle or DB2. Running in memory makes this perform even
>> better.
>> o Monitoring transient data streams -- here you slice and dice the data while the
>> monitoring application is up but you don't necessarily need to keep the data after the monitoring
>> session ends.
>> o Test rigs -- here you can use Derby on your laptop to run regression tests against
>> an application which will run in production on a big back end like Oracle or DB2; the rig
>> is lightweight and cleans up after itself.
>> >
>> > What kind of solution can Java provide for smart devices like iPhone, RIM or
>> > Palm? I.e., will Java play well with Windows Mobile or Android?
>> >
>> Our small device story is our ability to run on the stripped-down CDC VM. Being able
>> to run completely in memory gives this story extra appeal too.
>> Thanks,
>> -Rick
>> >
>> > Thank you in advance for shedding light on this for us,
>> >
>> > Lily
>> >
>> >
>> > *From:* Rick Hillegas <Richard.Hillegas@Sun.COM>
>> > *To:* Derby Discussion <derby-user@db.apache.org>
>> > *Sent:* Wednesday, September 9, 2009 11:13:05 AM
>> > *Subject:* Re: Derby in-memory back end - where to go next?
>> >
>> > Hi Kristian,
>> >
>> > Here's another piece of feedback: Last night I gave an overview of Derby to
>> > the San Francisco Java User's Group. A developer asked whether the growth of the in-memory
>> > database could be bounded. He had a use case which we didn't explore in depth but which involved
>> > periodically truncating the database. I asked him to bring his requirements to the Derby user
>> > list so that we could feed them into your spec effort. Here are my takeaways:
>> >
>> > * It would be great to be able to bound the growth of the in-memory db
>> >
>> > * It would be great if the memory occupied by deleted records could be released
>> >
>> > Thanks,
>> > -Rick
>> >
>> > Kristian Waagan wrote:
>> > > Hello,
>> > >
>> > > In Derby 10.5 an in-memory back end, or storage engine, was included. It
>> > > stores all the data in main memory, with the exception of derby.log. If this is news to you,
>> > > and you want a quick intro to it, see [1] and [2].
>> > >
>> > > I'm trying to gather some feedback on whether the current implementation
>> > > is found acceptable, or if there are additional features people would like to see. I expect
>> > > some wishes to emerge, and I plan to record these on the wiki page [1]. The page can then
>> > > be used to guide further work in this area.
>> > >
>> > > To start the discussion, I'll list some potential features and tasks. Feel
>> > > free to comment on any one of them either by replying to this thread, or by adding your comments
>> > > to [1]. It can be a +1 or -1 on the feature itself, a suggestion for a new feature, or details
>> > > on what a feature should look like.
>> > >
>> > >
>> > > * Documentation
>> > > Must at least document the JDBC subsubprotocol, and also explain how to
>> > > delete in-memory databases.
>> > > If new features are added, these must be documented as well.
>> > >
>> > > * Deletion of in-memory databases
>> > > Currently the only ways to delete an in-memory database are to restart
>> > > the JVM or use a static method that isn't part of Derby's public API. A proper mechanism for
>> > > deletion should be added.
>> > >
>> > > * Automatic deletion on database shutdown (or when last connection disconnects)
>> > >
>> > > * "Anonymous in-memory databases"
>> > > A database which only the connection creating it can access, and when the
>> > > connection goes away the database goes away.
>> > >
>> > > * Automatic persistence
>> > > The database could be persisted to disk automatically based on certain
>> > > criteria. The most obvious ones are perhaps on a fixed interval and on JVM shutdown.
>> > >
>> > > * Monitoring
>> > > The most basic information is how many in-memory databases exist in the
>> > > current JVM, and how big they are. How should this information be presented? Should it be
>> > > available to anyone having a connection to the current JVM?
>> > >
>> > > * No derby.log
>> > > Include a class in Derby that will discard everything written to derby.log.
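
The "No derby.log" item above can be approximated in application code. The sketch below relies on
Derby's derby.stream.error.field property, which names a public static stream field that receives
the error log. The class name DerbyLogSilencer and the field name DEV_NULL are hypothetical,
chosen for this example only. This is one possible workaround under those assumptions, not the
class proposed in the list above.

```java
import java.io.OutputStream;

public class DerbyLogSilencer {
    /** A stream that throws away everything written to it. */
    public static final OutputStream DEV_NULL = new OutputStream() {
        @Override
        public void write(int b) {
            // Intentionally empty: discard the byte.
        }

        @Override
        public void write(byte[] b, int off, int len) {
            // Intentionally empty: discard the buffer.
        }
    };

    /** Points Derby's error stream at DEV_NULL; call before the engine boots. */
    public static void install() {
        System.setProperty("derby.stream.error.field",
                DerbyLogSilencer.class.getName() + ".DEV_NULL");
    }

    public static void main(String[] args) {
        install();
        System.out.println(System.getProperty("derby.stream.error.field"));
    }
}
```

Discarding the log also discards boot and error messages, so this is probably best limited to
test rigs where a missing derby.log is not a problem.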
>> > >
>> > >
>> > > Thank you for your feedback,
>> >
>> >
