lucene-dev mailing list archives

From "PROYECTA.Fernandez Garcia, Ivan" <>
Subject Queries Lucene 1.3
Date Wed, 17 Nov 2004 14:49:11 GMT
Good afternoon everybody,

	First of all, thank you for your attention.

	We are using the Lucene 1.3 API to index and search text in PDF files.
	We develop in two environments: Windows, using Apache
Tomcat 5.0, and Sun Solaris, using Oracle Application Server.
	First we extract the text of each page from the PDF file using the
Multivalent API (this step seems to run OK).
	Then we search for text in the newly created index. At this point we
have the following problem:
		- If the PDF file has 10 pages, the text is found.
		- If the PDF file has more than 10 pages, the text is not
found.
	We modified the IndexWriter.minMergeDocs attribute, trying two values:
the total number of pages in the document, and 1.
	In both cases:
		- if the document is not big, indexing and text search both
seem to run OK.
		- if the document is big (600 pages), indexing fails with
an OutOfMemoryError.

	We are attaching the source code we use to index a PDF file and
search its text, in case you can spot an error.
	We don't know what else to try to solve this problem.
	Can you help us, please?

Thank you for your help.

 <<search_text.txt>>  <<index_lucene.txt>> 

> Iván Fernández García
> Proyecta Sistemas de Información