lucene-java-user mailing list archives

From "Spencer, Dave" <>
Subject RE: Indexing Db Table -- Better way request
Date Sat, 09 Nov 2002 00:45:40 GMT
We have a number of internal systems here (content mgmt, bug db, support
CRM), all of which are PHP/MySQL combos. In all cases Lucene is used for
the indexing, and we have never seen any reason to go to XML as an
intermediate step. We've been at this for 6 months or so.
The only hassle is that if the group that's doing the PHP/MySQL tweaks
the schema, they have to remember to modify the Lucene indexer so that,
say, it picks up the new columns. There's no way around this unless you
want to be very generic, in which case XML still doesn't give you
anything, since you could just as well use JDBC meta-data to get all
columns...
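
The "very generic" route mentioned above can be sketched with plain JDBC, with no XML step. This is a minimal sketch, not anyone's actual indexer: the class name RowFields and the method currentRowAsFields are made up for illustration, and it assumes every column can reasonably be read with getString. Each name/value pair it returns would become one field of a Lucene Document.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Lucene or JDBC): discovers the columns of
// the current row at run time instead of hard-coding them in the indexer.
final class RowFields {
    private RowFields() {}

    // Returns column label -> value for the row the ResultSet is positioned on.
    // Columns whose value is SQL NULL are skipped.
    static Map<String, String> currentRowAsFields(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        Map<String, String> fields = new LinkedHashMap<>(); // keeps column order
        for (int col = 1; col <= meta.getColumnCount(); col++) { // JDBC columns are 1-based
            String value = rs.getString(col);
            if (value != null) {
                fields.put(meta.getColumnLabel(col), value);
            }
        }
        return fields;
    }
}
```

A caller would loop with rs.next() over something like SELECT * FROM items and hand each returned map to whatever builds the Document. New columns then show up in the index without touching the indexer, at the cost of treating every column the same way.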

-----Original Message-----
From: Michael Caughey []
Sent: Friday, November 08, 2002 4:21 PM
To: Spencer, Dave; Lucene Users List
Subject: Re: Indexing Db Table -- Better way request

Converting straight to a document seemed to me the best answer as I
investigated.  Somewhere along the line I thought I remembered seeing a
suggestion that it was for some reason better to convert to XML and then
index it as an XML document.  I'd rather not have the hassle of creating
and then later parsing the XML.  I could not find the reference again.
This in fact was what I was hoping to hear.

----- Original Message -----
From: "Spencer, Dave" <>
To: "Lucene Users List" <>
Cc: <>
Sent: Friday, November 08, 2002 6:59 PM
Subject: RE: Indexing Db Table -- Better way request

One small comment: what's the point of converting a row to XML?
What I think you want to do is convert a row to a Document and then
pass that off to IndexWriter.

-----Original Message-----
From: Caughey, Michael []
Sent: Friday, November 08, 2002 2:22 PM
To: ''
Cc: ''
Subject: Indexing Db Table -- Better way request


I'm new to Lucene and this group; if it is improper to send such a
question to this group, I apologize.  I tried to do a reasonable amount
of up-front research before coming here.

I'm about to undertake a piece of my project where I've decided that
Lucene will be of use.  I have been researching, over the past two
weeks, ways to accomplish this.  I know I'll use an IndexWriter to write
the index to a file, but I'm having difficulty settling on how to
process the data to be indexed.

What I have is a table in a MySQL database called items.  I want to be
able to search on a couple of fields and have it return the ID:
Name VARCHAR (80)
Description TEXT
Location VARCHAR (80)
Qty int
ExpireDate Long YYYYMMDD
Category int
ListingPrice FLOAT(9,2)
Supplier int

ItemId int

On start up of the application every row in the database will be read.
After that I need to keep the table and the index in sync.  Data in the
columns can change, rows can be added and removed.  I have a central
controller which is responsible for all access to that table.

I figured one approach which would work would be, on start up, to read
each row, build an XML document, and submit it to the IndexWriter.
As inserts, deletes and updates occurred I could modify both Lucene and
the table.
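
Since one central controller owns all access to the table, the sync step can live there. The following is only a sketch under stated assumptions: ItemStore, InMemoryStore and ItemController are hypothetical names, and the in-memory store merely stands in for both the MySQL table and the Lucene index so the example runs on its own. In a recent Lucene release the index side would typically wrap IndexWriter.updateDocument and IndexWriter.deleteDocuments, keyed on an ItemId term.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction: anything that can store or drop an item by id.
// One implementation would talk to the items table, another to the index.
interface ItemStore {
    void put(int itemId, Map<String, String> fields); // insert or replace
    void remove(int itemId);
}

// In-memory stand-in, used here only so the sketch is runnable.
final class InMemoryStore implements ItemStore {
    final Map<Integer, Map<String, String>> rows = new HashMap<>();

    @Override public void put(int itemId, Map<String, String> fields) {
        rows.put(itemId, new HashMap<>(fields));
    }

    @Override public void remove(int itemId) {
        rows.remove(itemId);
    }
}

// Central controller: every change goes to the table first, then to the index.
final class ItemController {
    private final ItemStore table;
    private final ItemStore index;

    ItemController(ItemStore table, ItemStore index) {
        this.table = table;
        this.index = index;
    }

    // Covers both inserts and updates; synchronized so the two writes are not interleaved.
    synchronized void save(int itemId, Map<String, String> fields) {
        table.put(itemId, fields);
        index.put(itemId, fields);
    }

    synchronized void delete(int itemId) {
        table.remove(itemId);
        index.remove(itemId);
    }

    // Start-up pass: index every row that was read from the table.
    synchronized void rebuildIndex(Map<Integer, Map<String, String>> allRows) {
        for (Map.Entry<Integer, Map<String, String>> row : allRows.entrySet()) {
            index.put(row.getKey(), row.getValue());
        }
    }
}
```

Writing the table first keeps the database as the source of truth; if the index write fails, a later rebuildIndex pass can repair it.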

Seems simple enough, and may be the only way to handle it.  Before I did
that I wanted to make sure that there wasn't a better way.
Are there documents which can automatically read the table and build a
Should I read the row and just build fields and construct a document?

Does anyone see any problems with storing it in memory versus writing it
to a file?  Or should I say, at what point would you consider writing it
to a file; would you base that on total document size?  I feel that a
file index would most likely be just fine.

Thanks in advance for any suggestions.

Michael Caughey
