manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: org.hsqldb.HsqlException: java.lang.NegativeArraySizeException
Date Wed, 16 Mar 2016 16:03:15 GMT
Hi Ian,

The database size seems way too big for this crawl size.  I've not seen
this problem before but I suspect that whatever is causing the bloat is
also causing HSQLDB to fail.

Can you give me further details about what repository connections you are
using?  It is possible that there's a heretofore unknown pathological case
you are running into during the crawl.  Are there any custom connectors
involved?

If we rule out a bug of some kind, then the next thing to do would be to go
to a real database, e.g. PostgreSQL.
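
For reference, a minimal sketch of what that switch might look like in the
properties.xml used by the single-process example. The property names are the
ones I recall from the ManifoldCF deployment documentation and should be
checked against the installed version; the host, database name, and all
credentials are placeholder assumptions, not values from this thread.

```xml
<configuration>
  <!-- Assumption: select the PostgreSQL implementation instead of the default HSQLDB one -->
  <property name="org.apache.manifoldcf.databaseimplementationclass"
            value="org.apache.manifoldcf.core.database.DBInterfacePostgreSQL"/>

  <!-- Placeholder connection details for a hypothetical local PostgreSQL server -->
  <property name="org.apache.manifoldcf.postgresql.hostname" value="localhost"/>
  <property name="org.apache.manifoldcf.postgresql.port" value="5432"/>

  <!-- Placeholder superuser credentials, used to create the database and user -->
  <property name="org.apache.manifoldcf.dbsuperusername" value="postgres"/>
  <property name="org.apache.manifoldcf.dbsuperuserpassword" value="CHANGE_ME"/>

  <!-- Placeholder database name and crawler credentials -->
  <property name="org.apache.manifoldcf.database.name" value="dbname"/>
  <property name="org.apache.manifoldcf.database.username" value="manifoldcf"/>
  <property name="org.apache.manifoldcf.database.password" value="CHANGE_ME"/>
</configuration>
```

Changing the configuration alone does not carry over the existing HSQLDB
contents, so the jobs and connections would likely need to be recreated or
re-imported and the content recrawled.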

Karl


On Wed, Mar 16, 2016 at 11:04 AM, Ian Zapczynski <
Ian.Zapczynski@veritablelp.com> wrote:

> Hello,
>
> We've had ManifoldCF 2.0.1 working well with SOLR for months on Windows
> 2012 using the single process model. We recently noticed that new
> documents are not getting ingested, even after restarting the job, the
> server, etc. The first thing I see in the logs is a bunch of 500 errors
> coming out of SOLR as a result of ManifoldCF trying to index .tif files
> that are found in the directory structure being indexed. After that (not
> sure if related or not), I see a bunch of these errors:
>
> FATAL 2016-03-15 16:01:48,801 (Thread-1387745) -
> C:\apache-manifoldcf-2.0.1\example\.\./dbname.data getFromFile failed
> 33337202
> org.hsqldb.HsqlException: java.lang.NegativeArraySizeException
>  at org.hsqldb.error.Error.error(Unknown Source)
>  at org.hsqldb.persist.DataFileCache.getFromFile(Unknown Source)
>  at org.hsqldb.persist.DataFileCache.get(Unknown Source)
>  at org.hsqldb.persist.RowStoreAVLDisk.get(Unknown Source)
>  at org.hsqldb.index.NodeAVLDisk.findNode(Unknown Source)
>  at org.hsqldb.index.NodeAVLDisk.getRight(Unknown Source)
>  at org.hsqldb.index.IndexAVL.next(Unknown Source)
>  at org.hsqldb.index.IndexAVL.next(Unknown Source)
>  at org.hsqldb.index.IndexAVL$IndexRowIterator.getNextRow(Unknown Source)
>  at org.hsqldb.RangeVariable$RangeIteratorMain.findNext(Unknown Source)
>  at org.hsqldb.RangeVariable$RangeIteratorMain.next(Unknown Source)
>  at org.hsqldb.QuerySpecification.buildResult(Unknown Source)
>  at org.hsqldb.QuerySpecification.getSingleResult(Unknown Source)
>  at org.hsqldb.QuerySpecification.getResult(Unknown Source)
>  at org.hsqldb.StatementQuery.getResult(Unknown Source)
>  at org.hsqldb.StatementDMQL.execute(Unknown Source)
>  at org.hsqldb.Session.executeCompiledStatement(Unknown Source)
>  at org.hsqldb.Session.execute(Unknown Source)
>  at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(Unknown Source)
>  at org.hsqldb.jdbc.JDBCPreparedStatement.executeQuery(Unknown Source)
>  at org.apache.manifoldcf.core.database.Database.execute(Database.java:889)
>  at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
> Caused by: java.lang.NegativeArraySizeException
>  at org.hsqldb.lib.StringConverter.readUTF(Unknown Source)
>  at org.hsqldb.rowio.RowInputBinary.readString(Unknown Source)
>  at org.hsqldb.rowio.RowInputBinary.readChar(Unknown Source)
>  at org.hsqldb.rowio.RowInputBase.readData(Unknown Source)
>  at org.hsqldb.rowio.RowInputBinary.readData(Unknown Source)
>  at org.hsqldb.rowio.RowInputBase.readData(Unknown Source)
>  at org.hsqldb.rowio.RowInputBinary.readData(Unknown Source)
>  at org.hsqldb.rowio.RowInputBinaryDecode.readData(Unknown Source)
>  at org.hsqldb.RowAVLDisk.<init>(Unknown Source)
>  at org.hsqldb.persist.RowStoreAVLDisk.get(Unknown Source)
>  ... 21 more
> ERROR 2016-03-15 16:01:48,911 (Stuffer thread) - Stuffer thread aborting
> and restarting due to database connection reset: Database exception:
> SQLException doing query (S1000): java.lang.NegativeArraySizeException
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database
> exception: SQLException doing query (S1000):
> java.lang.NegativeArraySizeException
>  at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.finishUp(Database.java:702)
>  at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:728)
>  at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:771)
>  at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1444)
>  at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
>  at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:191)
>  at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performQuery(DBInterfaceHSQLDB.java:916)
>  at org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221)
>  at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getPipelineDocumentIngestDataChunk(IncrementalIngester.java:1783)
>  at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getPipelineDocumentIngestDataMultiple(IncrementalIngester.java:1748)
>  at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getPipelineDocumentIngestDataMultiple(IncrementalIngester.java:1703)
>  at org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:254)
> Caused by: java.sql.SQLException: java.lang.NegativeArraySizeException
>  at org.hsqldb.jdbc.JDBCUtil.sqlException(Unknown Source)
>  at org.hsqldb.jdbc.JDBCUtil.sqlException(Unknown Source)
>  at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(Unknown Source)
>  at org.hsqldb.jdbc.JDBCPreparedStatement.executeQuery(Unknown Source)
>  at org.apache.manifoldcf.core.database.Database.execute(Database.java:889)
>  at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:683)
> Caused by: org.hsqldb.HsqlException: java.lang.NegativeArraySizeException
>
> After these errors occur, the job just seems to hang and not process any
> further documents or log anything more in the manifoldcf.log. So I see
> the error is coming out of the HyperSQL database, but I don't know why.
> There is sufficient disk space. The database file is now 33 GB (larger
> than I'd expect for our ~110,000 documents), but I haven't seen
> any evidence that we're hitting a limit on file size. I'm afraid I'm not
> sure where to go from here to further nail down the problem.
>
> As always, any and all help is much appreciated.
>
> Thanks,
>
> -Ian
>
