db-derby-user mailing list archives

From <de...@segel.com>
Subject RE: improve performance on large insert numbers
Date Tue, 22 Aug 2006 17:41:57 GMT
And why do you want to use a database for this task?

I was waiting for someone to ask this, but since no one did I guess I have
to do it... ;-)

What you've said is that you want to load a lot of records into a table,
really, really fast.

No indexes, and you haven't identified any type of query or extract.

So why use a database for this function?

A general purpose RDBMS will not be the fastest method of capturing and
storing this data.

What you may want to consider is capturing the initial data into a flat
file, then using a separate process to load that data into indexed
database table(s) for future use.

Since you really didn't go into detail, let me give you a hypothetical
example of a solution that may parallel your problem...

Suppose your application is set up to log any and all TCP/IP traffic on your
server. Depending on your server, there can be a lot of traffic.
So your application writes the traffic information to a file or set of
files (round robin or something similar). A secondary application tracks
when a file buffer is full and writes that buffer to the database. On
completion, the second program tells the first that the file buffer is
cleared for deletion, and the first can then delete and reuse that
buffer's name.

This way you still have indexes on the tables for fast retrieval, and you
can use things like triggers to set up alerts based on the data being
inserted.

In essence you're writing to a buffer pool prior to storage in the database.
(The advantage being that in the event of a server crash, the data is
safe(r) in the files of the buffer pool than in memory....)

Informix had an interesting take on this: the Financial Foundation, which
included its real-time loader. Instead of files, it used memory as its
buffer. Of course, those who implemented solutions would tee the data off
to a file as well, but that's another story.



> -----Original Message-----
> From: Mike Matrigali [mailto:mikem_app@sbcglobal.net]
> Sent: Tuesday, August 22, 2006 11:08 AM
> To: Derby Discussion
> Subject: Re: improve performance on large insert numbers
> Is your machine CPU-bound during the insert? If not, having the log
> on a separate disk may help.
> Does your app always load a large number of entries into an empty table,
> or does it also load a large number of entries into an existing table
> that already has entries? If loading into an empty table, then using one
> of the import system routines with a VTI would avoid logging.
> SOA Work wrote:
> > Hi there,
> >
> > I'm currently using Derby to log values occurring in my application.
> > One value consists of a string, a date object and a double value. Not
> > much data. But I want to insert a huge number of entries.
> >
> > At the moment inserting 10 000 entries takes about 2 seconds. Now I
> > would like to hear your opinion about how to improve performance (if
> > possible).
> >
> > Here the ideas I'm currently using (comment if you want)
> >
> > 1.) I use a table without indexes, primary keys and so on.
> > "create table log(id varchar(50), value double, timestamp date)"
> >
> > 2.) I use the embedded mode
> >
> > 3.) I use a prepared statement "insert into log values (?, ?, ?)"
> >
> > 4.) I disabled auto commit and use addBatch for all 10000 entries. At
> > the end I call executeBatch() and commit().
> >
> > I would like to hear your suggestions or comments.
> > Best regards
> >
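For reference, points 1-4 of the quoted message translate into roughly the JDBC code below. This is only a sketch: the Connection is assumed to come from the embedded Derby driver, and the table matches the quoted "create table log(...)" statement.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/**
 * Sketch of the batched insert described in points 1-4 above:
 * auto-commit off, one PreparedStatement, addBatch() per row,
 * then a single executeBatch() and commit().
 */
public class BatchLogger {
    // Matches "create table log(id varchar(50), value double, timestamp date)"
    static final String INSERT_SQL = "insert into log values (?, ?, ?)";

    public static int insertAll(Connection conn, String[] ids, double[] values,
                                java.sql.Date[] timestamps) throws SQLException {
        conn.setAutoCommit(false);  // point 4: commit once at the end
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) { // point 3
            for (int i = 0; i < ids.length; i++) {
                ps.setString(1, ids[i]);
                ps.setDouble(2, values[i]);
                ps.setDate(3, timestamps[i]);
                ps.addBatch();            // queue the row, no execute yet
            }
            int[] counts = ps.executeBatch(); // single batched round trip
            conn.commit();
            return counts.length;
        }
    }
}
```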
