db-derby-user mailing list archives

From Knut Anders Hatlen <Knut.Hat...@Sun.COM>
Subject Re: Performance question
Date Fri, 13 Jul 2007 09:19:33 GMT
Steve Pannier <steve.pannier@qlogic.com> writes:

> I have a question regarding Derby performance:
> We will be developing a Java application to collect and store statistics
> and counters from devices, so that we can then graph the performance of
> these devices.  We need to store statistics on a per-second basis for up
> to several months at a time, and every second we need to be able to
> store stats for up to 400 devices.  Plus, each device can have up to 50
> statistics/counters ... every second.  So you can see there will be
> large amounts of data being stored.
> We are hoping Derby will meet our needs as it is free, has a small
> footprint, and is Java-based.  I have been testing Derby to see how it
> performs storing large amounts of data in this environment, and I've
> seen a slight degradation of performance over time.  I'm using Derby.
> I track the amount of time it takes for each iteration, with
> one "iteration" being an insert of 400 records.  My last test was 1
> million iterations, and the average time for each iteration slowly
> increased over the life of the test.  (I've attached a file showing
> average times.)
> I did read the "Tuning Derby" manual, and applied the following tuning
> changes while running my tests:
> - Set autocommit to false (I programmatically do a commit
>   every 2000 inserts)
> - Changed derby.storage.pageSize to 16384
> - Changed derby.storage.pageCacheSize to 5000
> - Use a PreparedStatement for my inserts

Sounds like sensible settings. I'm not sure about the pageSize property,
though. Derby performs some auto-tuning of the page size for each table,
so you might want to re-run the test without that property (and also
increase the pageCacheSize to compensate for the smaller pages).
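For what it's worth, the kind of insert loop you describe might look
something like the sketch below. The JDBC URL, the column subset, and the
way the timestamp is generated are my assumptions, not your actual code;
the commit interval matches the 2000 inserts you mentioned:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class StatsInserter {
    static final int COMMIT_INTERVAL = 2000; // commit every 2000 inserts

    // Decide whether it is time to commit after the n-th insert.
    static boolean shouldCommit(long n) {
        return n % COMMIT_INTERVAL == 0;
    }

    public static void main(String[] args) throws SQLException {
        // The database name in the URL is an assumption; adjust as needed.
        try (Connection conn =
                 DriverManager.getConnection("jdbc:derby:statsDB;create=true")) {
            conn.setAutoCommit(false); // manual commits, as in your test setup
            String sql = "insert into STATS (query_id, create_time, switch, port) "
                       + "values (?, ?, ?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                long n = 0;
                for (int iteration = 0; iteration < 1000; iteration++) {
                    for (int device = 0; device < 400; device++) {
                        ps.setInt(1, iteration);
                        // Counter as a stand-in timestamp, so the
                        // (create_time, switch, port) primary key stays unique.
                        ps.setLong(2, n);
                        ps.setString(3, "switch-" + device / 50);
                        ps.setInt(4, device % 50);
                        ps.executeUpdate();
                        if (shouldCommit(++n)) {
                            conn.commit();
                        }
                    }
                }
                conn.commit(); // commit the tail end of the last batch
            }
        }
    }
}
```

You might also look at JDBC batching (addBatch/executeBatch on the
prepared statement), which can reduce per-statement overhead further.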

> My table is called "stats", and is defined as follows:
>     "create table STATS (" +
>     "query_id int not null," +
>     "create_time bigint not null," +
>     "switch char(20), port int, " +
>     "intProp1 int, intProp2 int," +
>     "intProp3 int, intProp4 int," +
>     "intProp5 int, intProp6 int," +
>     "intProp7 int, intProp8 int," +
>     "strProp1 char(20), strProp2 char(20)," +
>     "strProp3 char(20), strProp4 char(20)," +
>     "strProp5 char(20), strProp6 char(20)," +
>     "strProp7 char(20), strProp8 char(20)," +
>     "longProp1 bigint, longProp2 bigint," +
>     "longProp3 bigint, longProp4 bigint," +
>     "longProp5 bigint, longProp6 bigint," +
>     "longProp7 bigint, longProp8 bigint," +
>     "longProp9 bigint, longProp10 bigint," +
>     "longProp11 bigint, longProp12 bigint," +
>     "longProp13 bigint, longProp14 bigint," +
>     "longProp15 bigint, longProp16 bigint," +
>     "longProp17 bigint, longProp18 bigint," +
>     "longProp19 bigint, longProp20 bigint," +
>     "longProp21 bigint, longProp22 bigint," +
>     "longProp23 bigint, longProp24 bigint," +
>     "longProp25 bigint, longProp26 bigint," +
>     "longProp27 bigint, longProp28 bigint," +
>     "longProp29 bigint, longProp30 bigint," +
>     "longProp31 bigint, longProp32 bigint," +
>     "PRIMARY KEY ( create_time, switch, port ))"
> Does anyone have any suggestions or ideas as to why I'm seeing the
> performance degrade over time?

My first guess is that the disk is causing the slowdown. Disks normally
write faster to lower cylinder numbers than to higher cylinder
numbers. Since your test inserts (I think) several gigabytes, the
database might grow into slower areas of the disk.

Another thing that could negatively influence performance is that the
primary key index grows as more rows are inserted. Derby organizes its
indexes as B-trees, and as the number of rows in the table increases,
the B-tree grows taller and the cost of inserting a new row grows
logarithmically with the number of rows.
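To get a feel for the numbers: assuming an illustrative fanout of 100
entries per B-tree page (a made-up figure, not a measured Derby value),
the height of the index grows very slowly with the row count:

```java
public class BTreeHeight {
    // Height (number of levels from root to leaf) of an idealized
    // B-tree with the given fanout holding n keys.
    static int height(long n, int fanout) {
        int h = 1;
        long capacity = fanout;
        while (capacity < n) {
            capacity *= fanout;
            h++;
        }
        return h;
    }

    public static void main(String[] args) {
        int fanout = 100;
        for (long n : new long[] {1_000L, 1_000_000L, 400_000_000L}) {
            System.out.println(n + " rows -> height " + height(n, fanout));
        }
    }
}
```

In this model an insert into a 400-million-row index touches only a
couple more levels than an insert into a thousand-row index, so the
B-tree growth would account for a gradual, not a dramatic, slowdown.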

> Is there a better way of defining my
> "stats" table? Any other tuning parameters I can apply?

If you have the luxury of owning two disks, this update-intensive load
would probably benefit from putting the database and the transaction log
on separate disks. You would have to create the database with a URL like

  jdbc:derby:statsDB;create=true;logDevice=/disk2/derbylog

where /disk2/derbylog is a directory on the second disk (the logDevice
connection attribute tells Derby where to keep the transaction log).
You could also try having multiple threads insert rows in
parallel. That might give you better CPU utilization, since there will be
threads ready to work while other threads are blocked waiting for disk
I/O.
Knut Anders
