lucene-java-user mailing list archives

From Sascha Fahl
Subject Re: Lucene vs. Database
Date Wed, 01 Oct 2008 07:55:58 GMT
There is a big conceptual difference. Lucene works with an inverted
index, which means that you have a list of words (terms), each with a
list of all documents that contain that word (term). Databases usually
work with normal indexes, which means you have a document describing
the words (terms) it contains. From my point of view, you should use a
database for the detail queries for a very simple reason: querying a
document by its database ID is quite fast. Depending on your DBMS, the
document ID is the primary key, which means you have a fast data
structure (often some kind of B-tree) for accessing the data behind the
ID. In Lucene you would have to query the inverted index, which would
need to be organized in a way that lets you access it like the B-tree.
On top of that, your database is probably written in C, which should
make file access through the database a lot faster.
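
To make the difference between the two layouts concrete, here is a minimal sketch in plain Java. It does not use Lucene or a real database; the class `IndexSketch` and its methods `forward` and `invert` are hypothetical names made up for this illustration, and the tokenizer is just a whitespace split.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Lucene.
class IndexSketch {
    // Forward view: document id -> the terms that document contains.
    static Map<Integer, List<String>> forward(Map<Integer, String> docs) {
        Map<Integer, List<String>> out = new TreeMap<>();
        for (Map.Entry<Integer, String> e : docs.entrySet()) {
            // Naive tokenizer (assumption): lowercase, split on whitespace.
            out.put(e.getKey(), Arrays.asList(e.getValue().toLowerCase().split("\\s+")));
        }
        return out;
    }

    // Inverted view: term -> sorted ids of all documents containing that term.
    static Map<String, SortedSet<Integer>> invert(Map<Integer, List<String>> forward) {
        Map<String, SortedSet<Integer>> out = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> e : forward.entrySet()) {
            for (String term : e.getValue()) {
                out.computeIfAbsent(term, t -> new TreeSet<Integer>()).add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Integer, String> docs = new TreeMap<>();
        docs.put(1, "lucene inverted index");
        docs.put(2, "database primary key index");
        docs.put(3, "lucene search");

        Map<Integer, List<String>> fwd = forward(docs);
        Map<String, SortedSet<Integer>> inv = invert(fwd);

        System.out.println(inv.get("lucene")); // [1, 3]
        System.out.println(inv.get("index"));  // [1, 2]
        System.out.println(fwd.get(2));        // [database, primary, key, index]
    }
}
```

The inverted map answers "which documents contain this term?" in one lookup, while the forward map answers "what is in document 2?" in one lookup. The detail page asks the second kind of question.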

So, in conclusion: use the database. First, the Lucene index was not
designed for queries like the ID lookup, and second, file access through
your database should give you better performance.
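
As a rough sketch of that split, the following plain-Java example searches a small term index to get IDs and then loads the full row by primary key. Everything here is an assumption for illustration: `SearchThenFetch`, `Row`, and the field names are hypothetical, the "index" is a `HashMap` standing in for Lucene, and the "table" is a `TreeMap` standing in for a database table with a sorted primary-key index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: the "index" and the "database" are in-memory maps.
class SearchThenFetch {
    // Stand-in for a database row; the field names are invented.
    static final class Row {
        final int id;
        final String title;
        final String details;

        Row(int id, String title, String details) {
            this.id = id;
            this.title = title;
            this.details = details;
        }
    }

    // Stand-in for the search index: term -> ids of rows whose title matches.
    private final Map<String, SortedSet<Integer>> index = new HashMap<>();
    // Stand-in for the table, keyed by primary key in a sorted structure.
    private final NavigableMap<Integer, Row> table = new TreeMap<>();

    void add(Row row) {
        table.put(row.id, row);
        // Only the searchable field goes into the index, not the details.
        for (String term : row.title.toLowerCase().split("\\s+")) {
            index.computeIfAbsent(term, t -> new TreeSet<Integer>()).add(row.id);
        }
    }

    // Step 1: the search returns only the ids of the matching rows.
    List<Integer> search(String term) {
        return new ArrayList<>(index.getOrDefault(term.toLowerCase(), new TreeSet<Integer>()));
    }

    // Step 2: the detail page loads the complete row by primary key.
    Row fetch(int id) {
        return table.get(id);
    }

    public static void main(String[] args) {
        SearchThenFetch app = new SearchThenFetch();
        app.add(new Row(7, "Lucene in Action", "Full record for id 7"));
        app.add(new Row(9, "Database Systems", "Full record for id 9"));

        List<Integer> hits = app.search("lucene");
        System.out.println(hits);                       // [7]
        System.out.println(app.fetch(hits.get(0)).details); // Full record for id 7
    }
}
```

In a real setup the first step would be a Lucene query and the second a primary-key lookup in the database; the sketch only shows how the ID connects the two.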

On 01.10.2008 at 09:43, agatone wrote:

> Hi,
> I already asked this question on the "lucene-general" list but was
> advised to ask here too.
> I'm working on a project that has a big database in the background
> (some tables have about 1,500,000 rows). We decided to use Lucene for
> "faster" search. Our search works like most searches: you enter a
> search string and get a list of hits, each with a link to details. But
> there is a dilemma over whether we should store more data in the index
> than is needed.
> One side of the development team insists that we should use the Lucene
> index as a kind of data store: when you get a hit and go to the
> details, you again use Lucene to find the document that matches the
> selected ID and take the data from the Lucene index. In the end you
> wind up copying complete database tables into the Lucene index.
> The other side insists on storing in the index only the data that is
> displayed directly to the user in the search results list and the data
> needed for the search criteria. When you go to the details, you have
> the matching ID, so you can pick up that row from the database by that
> ID rather than searching for it inside the Lucene index.
> Can someone please describe the drawbacks and advantages of both
> approaches?
> Actually, can someone write down what the actual benefit of Lucene
> itself is in a real production environment, and where and when it
> applies?
> It would be great if anyone could share their experience with
> indexing and searching large amounts of data.
> Thank you

Sascha Fahl

evenity GmbH
Zu den Mühlen 19
D-35390 Gießen

