lucene-java-user mailing list archives

From "Che Dong" <>
Subject [PLAN]: SAXIndexer, indexing database via XML gateway
Date Thu, 05 Jun 2003 16:55:01 GMT
The current WebLucene project includes a SAX-based XML source indexer.

It can parse an XML data source like the following example:
<?xml version="1.0" encoding="GB2312"?>
 <Record id="1">
  <Field name="Id">39314</Field>
  <Field name="Title">title of document</Field>
  <Field name="Author">chedong</Field>
  <Field name="Content">blah blah</Field>
  <Field name="PubTime">2003-06-06</Field>
  <Index name="FullIndex">Title,Content</Index>
  <Index name="TitleIndex" token="no">Author</Index>

I use two Index elements in each Record block to specify the field => index mapping. The
SAXIndexer parses this XML source, puts Id, Title, Author, Content, and PubTime into Lucene
as store-only fields, and creates two additional index fields:
one index field built from Title + Content
one index field built from Author, not tokenized
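
To make the field => index mapping concrete, here is a minimal sketch using only the JDK SAX classes. It is not the WebLucene SAXIndexer code: the class name RecordMappingSketch and its maps are hypothetical, it assumes the Index elements come after the Field elements they refer to (as in the example above), and it assumes any token value other than "no" means the index field is tokenized.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, not the WebLucene SAXIndexer: collects one <Record>
// into stored values and combined index values using only JDK SAX classes.
class RecordMappingSketch extends DefaultHandler {
    // Field name -> text, in document order (the store-only fields).
    final Map<String, String> stored = new LinkedHashMap<>();
    // Index name -> concatenated text of the fields it maps to.
    final Map<String, String> indexText = new LinkedHashMap<>();
    // Index name -> whether the index field should be tokenized.
    final Map<String, Boolean> tokenized = new LinkedHashMap<>();

    private final StringBuilder text = new StringBuilder();
    private String currentName;
    private boolean currentTokenized;

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0);
        currentName = atts.getValue("name");
        // Assumption: any token value other than "no" means "tokenize".
        currentTokenized = !"no".equals(atts.getValue("token"));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if ("Field".equals(qName)) {
            stored.put(currentName, text.toString());
        } else if ("Index".equals(qName)) {
            // The element body is a comma-separated list of field names.
            List<String> parts = new ArrayList<>();
            for (String field : text.toString().split(",")) {
                String value = stored.get(field.trim());
                if (value != null) parts.add(value);
            }
            indexText.put(currentName, String.join(" ", parts));
            tokenized.put(currentName, currentTokenized);
        }
    }

    // Parses one XML document from a string and returns the filled handler.
    static RecordMappingSketch parse(String xml) {
        RecordMappingSketch handler = new RecordMappingSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse record XML", e);
        }
        return handler;
    }
}
```

Turning these maps into Lucene Field objects is left out on purpose, because the Field factory methods differ between Lucene versions.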

Recently I have noticed that more and more applications provide an XML interface very similar to RSS.
For example, you can even dump a table to XML output from phpMyAdmin, like the following:
<?xml version="1.0" encoding="iso-8859-1"?>
  <!-- Table user -->

The SAXIndexer would be able to index a database XML dump directly if it let you specify the
field => index mapping rules externally, outside the XML source itself.
For example:
java IndexRunner -c field_index_mapping.conf -i http://localhost/table_dump.xml

# the config file looks like the following:
FullIndex    Title,Content
AuthorIndex  Author         no
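
A rough sketch of how such a mapping file might be read follows. The class IndexMappingRule is hypothetical and not part of Lucene or WebLucene. It assumes whitespace-separated columns (index name, comma-separated source fields, optional tokenize flag), treats lines starting with # as comments, and treats a missing third column as "tokenize".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of reading the mapping file shown above; the class
// and field names are illustrative and not part of WebLucene or Lucene.
class IndexMappingRule {
    final String indexName;
    final List<String> sourceFields;
    final boolean tokenized;

    IndexMappingRule(String indexName, List<String> sourceFields, boolean tokenized) {
        this.indexName = indexName;
        this.sourceFields = sourceFields;
        this.tokenized = tokenized;
    }

    // Parses lines of the form: <index name> <field,field,...> [no]
    static List<IndexMappingRule> parse(List<String> lines) {
        List<IndexMappingRule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and comment lines.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) continue;
            String[] cols = trimmed.split("\\s+");
            if (cols.length < 2) {
                throw new IllegalArgumentException("Bad mapping line: " + line);
            }
            // Assumption: only an explicit "no" in column three disables tokenizing.
            boolean tokenized = !(cols.length > 2 && "no".equalsIgnoreCase(cols[2]));
            rules.add(new IndexMappingRule(cols[0], Arrays.asList(cols[1].split(",")), tokenized));
        }
        return rules;
    }
}
```

An indexer could load these rules once at startup and apply them to every record in the dump, so the XML source itself would not need Index elements.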

I hope this SAXIndexer can be added to the Lucene demos so that Lucene end users can build
Lucene indexes from their existing database applications.


Che, Dong