lucene-dev mailing list archives

From Andi Vajda <>
Subject Re: GData Server - Lucene storage
Date Sat, 03 Jun 2006 00:50:49 GMT

On Fri, 2 Jun 2006, jason rutherglen wrote:

> Is it possible to turn off directory locking with BDB?  How is the 
> performance compared to regular FSDirectory for queries?

The DBLock class in DbDirectory's package does absolutely nothing. This is
because Berkeley DB will do the locking it needs to keep the transactions
isolated anyway.

If you run several transactions concurrently, be ready to delve into the 
delicacies of recovering from aborted-by-deadlock transactions and/or avoiding 
hard deadlocks.
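
To make the recovery part concrete, here is a minimal sketch of a retry loop around a unit of work that may be aborted by a deadlock. It is only an illustration under stated assumptions: DeadlockAbortedException and DeadlockRetry are hypothetical names, not part of Berkeley DB or Lucene, and the exception stands in for whatever deadlock exception your Berkeley DB binding actually throws. The unit of work is assumed to begin its own transaction, abort it on failure, and commit it on success, so that every retry starts from a clean state.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for the deadlock exception raised by a Berkeley DB
// binding when a transaction is chosen as the deadlock victim.
class DeadlockAbortedException extends Exception {
    DeadlockAbortedException(String message) {
        super(message);
    }
}

// Hypothetical helper: retries a unit of work that was aborted by deadlock.
class DeadlockRetry {
    // Calls work up to maxAttempts times. Only DeadlockAbortedException
    // triggers a retry; any other exception propagates immediately.
    static <T> T run(Callable<T> work, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        DeadlockAbortedException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Assumption: work begins, and commits or aborts, its own
                // transaction on every call.
                return work.call();
            } catch (DeadlockAbortedException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Back off a little longer after each failed attempt so
                    // the competing transaction has a chance to finish.
                    Thread.sleep(10L * attempt);
                }
            }
        }
        // Every attempt was aborted by deadlock: report the last failure.
        throw last;
    }
}
```

A caller would wrap each group of index updates in a Callable and pass it to DeadlockRetry.run with a small attempt limit. Retrying only helps with transactions that Berkeley DB aborts; it does nothing for a hard deadlock that is never detected.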

More about this here:


> ----- Original Message ----
> From: Andi Vajda <>
> To:; jason rutherglen <>
> Sent: Friday, June 2, 2006 10:52:27 AM
> Subject: Re: GData Server - Lucene storage
> On Fri, 2 Jun 2006, jason rutherglen wrote:
>> It might be interesting to merge using BDB into Solr, as an option to
>> provide better realtime updates.  Perhaps the replication could be used as
>> well in place of rsync?  I don't have any experience with BDB replication,
>> anyone have thoughts on the matter?
> If you're thinking of using Berkeley DB as the store behind the Lucene index
> via the DbDirectory Directory implementation, here are a few things to keep in
> mind:
>   - always setUseCompoundFile(false)
>     don't use compound Lucene index files on top of Berkeley DB:
>      . there is a bug that prevents this from working correctly
>      . it makes no sense anyway since it duplicates what DbDirectory is
>        already doing (all index files are stored in the same Berkeley DB file)
>      . it slows things down
>   - if you are using a transaction around all the index updates, you may want
>     to consider doing all the index updates in a RAMDirectory first and then
>     adding the RAMDirectory wholesale to the DbDirectory in that transaction.
>     This makes indexing considerably faster (3 times for me) and does a LOT
>     less thrashing around in Berkeley DB which can lead to a large number of
>     transactional log files rapidly filling up your hard drive.
> I'm not really sure if and how index merging works. For my use, having no
> merging is good enough since I never update existing documents, but always
> instead add a new version of them. The concept of version is tied to my
> application and each transaction corresponds to a new version.
> Andi..
