manifoldcf-user mailing list archives

From <karl.wri...@nokia.com>
Subject RE: RE: Beginner's question
Date Mon, 26 Jul 2010 08:26:10 GMT
Same problem, different place.
Fix checked in, and I've audited the webconnector code for this issue more thoroughly, so
you should not see it there again.  I'll be auditing other connectors as well.
Karl

________________________________________
From: ext c.a.r.e@gmx.de [c.a.r.e@gmx.de]
Sent: Monday, July 26, 2010 1:45 AM
To: connectors-user@incubator.apache.org
Subject: Re: RE: Beginner's question

Hi,

thanks a lot for fixing it. :)
When starting the job I receive an NPE in the LCF log files.
-----------------------------------
[Startup thread] FATAL org.apache.lcf.crawlerthreads - Error tossed: null
java.lang.NullPointerException
    at java.io.StringReader.<init>(StringReader.java:33)
    at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector.stringToArray(WebcrawlerConnector.java:6681)
    at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector$DocumentURLFilter.<init>(WebcrawlerConnector.java:7158)
    at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector.addSeedDocuments(WebcrawlerConnector.java:460)
    at org.apache.lcf.crawler.connectors.BaseRepositoryConnector.addSeedDocuments(BaseRepositoryConnector.java:243)
    at org.apache.lcf.crawler.system.StartupThread.run(StartupThread.java:184)
-----------------------------------
The seed I entered was something like "www.apache.org" or "http://www.apache.org".
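
The trace above ends in the StringReader constructor, which throws a NullPointerException when it is handed a null string. A minimal sketch of a null-tolerant line splitter follows, assuming the value being parsed is a newline-separated list that may be absent. The class and method names here (NullSafeLines, stringToLines) are hypothetical and are not the actual WebcrawlerConnector code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the real connector code: splits a newline-separated
// specification value into trimmed, non-empty lines and tolerates a missing value.
final class NullSafeLines {
  private NullSafeLines() {}

  static List<String> stringToLines(String value) {
    List<String> lines = new ArrayList<>();
    // new StringReader(null) throws NullPointerException, so handle null first.
    if (value == null) {
      return lines;
    }
    try (BufferedReader reader = new BufferedReader(new StringReader(value))) {
      String line;
      while ((line = reader.readLine()) != null) {
        String trimmed = line.trim();
        if (trimmed.length() > 0) {
          lines.add(trimmed);
        }
      }
    } catch (IOException e) {
      // Not expected when reading from an in-memory string.
      throw new IllegalStateException(e);
    }
    return lines;
  }

  public static void main(String[] args) {
    System.out.println(stringToLines(null));
    System.out.println(stringToLines("http://www.apache.org\n"));
  }
}
```

Treating a missing value as an empty list is one possible choice; a caller could instead reject it explicitly if an absent value should count as a configuration error.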

And some minor problems: after creating a webcrawler job, I still receive an empty screen when
clicking "View" in the "List all Jobs" tab, or "Save" after opening the "Edit" dialog.

Carina

-------- Original Message --------
Date: Fri, 23 Jul 2010 14:52:23 +0200
From: karl.wright@nokia.com
To: connectors-user@incubator.apache.org
Subject: RE: Beginner's question

Done.  r967081.

Karl

From: Wright Karl (Nokia-MS/Cambridge)
Sent: Friday, July 23, 2010 8:39 AM
To: connectors-user@incubator.apache.org
Subject: RE: Beginner's question

It appears that work done for the API inadvertently broke the web connector UI.  I'll check
in a fix shortly.

Karl

From: Wright Karl (Nokia-MS/Cambridge)
Sent: Friday, July 23, 2010 8:32 AM
To: connectors-user@incubator.apache.org
Subject: RE: Beginner's question

Your configuration looks reasonable.  Do you see any stack traces in either the LCF log, or
the tomcat log?

I'll try the same thing here and see what happens.

Karl

From: ext c.a.r.e@gmx.de [mailto:c.a.r.e@gmx.de]
Sent: Friday, July 23, 2010 8:27 AM
To: connectors-user@incubator.apache.org
Subject: Re: Beginner's question

Hi,

I'm still having the problem I explained below:

When I create a new job choosing a web connector I receive an empty screen when clicking on
one of the other tabs (Scheduling etc.).

When selecting a Filesys Connector everything works fine.

I think I might have an error in my web connector configuration.

Name: Web Con
Description:
________________________________
Connection type: Web Connector
Max connections: 10
Authority: None (global authority)
________________________________
Throttling (Bin regular expression | Description | Max avg fetches/min):
    No throttles
________________________________
Email address: mail@example.org
Robots usage: Obey robots.txt for all fetches
Bandwidth throttling (Bin regular expression | Case insensitive? | Max connections | Max kbytes/sec | Max fetches/min):
    No bandwidth throttling
Page access credentials (URL regular expression | Credential type | Credential domain | User name):
    No page access credentials
Session-based access credentials (URL regular expression | Login pages):
    No session-based access credentials
Trust certificates (URL regular expression | Certificate):
    No trust certificates
________________________________
Connection status: Connection working

Any ideas?
Carina

-------- Original Message --------
Date: Wed, 21 Jul 2010 16:04:10 +0200
From: Marc Emery <marco.emery@gmail.com>
To: connectors-user@incubator.apache.org
Subject: Re: Beginner's question

Hi,
It works, thanks a lot.

Cheers

2010/7/21 <karl.wright@nokia.com>

Code has just been checked in which fixes this subtle but nasty bug.

Let me know what happens now. ;-)
Karl

-----Original Message-----
From: Wright Karl (Nokia-MS/Cambridge)
Sent: Wednesday, July 21, 2010 8:50 AM
To: connectors-user@incubator.apache.org
Subject: RE: Beginner's question

Well, that explains why your test isn't succeeding.

I think I've found the cause of the problem, however.  It is *indeed* the language default
used by Derby.  The following code is the problem:

>>>>>>
 protected LCFException reinterpretException(LCFException theException)
 {
   if (Logging.db.isDebugEnabled())
     Logging.db.debug("Reinterpreting exception '"+theException.getMessage()+"'.  The exception type is "+Integer.toString(theException.getErrorCode()));
   if (theException.getErrorCode() != LCFException.DATABASE_CONNECTION_ERROR)
     return theException;
   Throwable e = theException.getCause();
   if (!(e instanceof java.sql.SQLException))
     return theException;
   if (Logging.db.isDebugEnabled())
     Logging.db.debug("Exception "+theException.getMessage()+" is possibly a transaction abort signal");
   String message = e.getMessage();
   if (message.indexOf("due to a deadlock") != -1)
     return new LCFException(message,e,LCFException.DATABASE_TRANSACTION_ABORT);
   // Note well: We also have to treat 'duplicate key' as a transaction abort, since this
   // is what you get when two threads attempt to insert the same row.  (Everything only
   // works, then, as long as there is a unique constraint corresponding to every bad
   // insert that one could make.)
   if (message.indexOf("duplicate key") != -1)
     return new LCFException(message,e,LCFException.DATABASE_TRANSACTION_ABORT);
   if (Logging.db.isDebugEnabled())
     Logging.db.debug("Exception "+theException.getMessage()+" is NOT a transaction abort signal");
   return theException;
 }
<<<<<<

It looks like Derby has a specific exception class for these kinds of exceptions, so I will
be able to test for them directly rather than looking at the message text.  Stay tuned.
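
For illustration, a locale-independent check might look like the sketch below. It is not the fix that was checked in; the class and method names are hypothetical. It relies only on the standard JDBC SQLState value rather than the translated message text, and it assumes the usual state codes: 23505 for a unique-constraint violation and class 40 for a transaction rollback such as a deadlock.

```java
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

// Hypothetical helper, not the actual LCF code: decides whether a database
// failure should be treated as a retryable transaction abort without reading
// the (possibly localized) exception message.
final class TransactionAbortCheck {
  private TransactionAbortCheck() {}

  static boolean isTransactionAbort(Throwable cause) {
    if (!(cause instanceof SQLException)) {
      return false;
    }
    String state = ((SQLException) cause).getSQLState();
    if (state == null) {
      return false;
    }
    // Assumption: 23505 = duplicate key in a unique constraint or index,
    // SQLState class 40 = transaction rollback (for example a deadlock).
    return state.equals("23505") || state.startsWith("40");
  }

  public static void main(String[] args) {
    // The message text is irrelevant to the check, whatever language it is in.
    SQLException duplicate =
        new SQLIntegrityConstraintViolationException("localized message", "23505");
    System.out.println(isTransactionAbort(duplicate));
  }
}
```

Checking the SQLState is narrower than testing for the SQLIntegrityConstraintViolationException class alone, since that class can also be raised for other constraint violations that are not the result of two threads inserting the same row.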

Karl

-----Original Message-----
From: ext c.a.r.e@gmx.de [mailto:c.a.r.e@gmx.de]
Sent: Wednesday, July 21, 2010 8:25 AM
To: connectors-user@incubator.apache.org
Subject: Re: Beginner's question

Hi,

I'm getting the same exception as Marc except that on my machine it's German text ;o)
I tried it first with jdk 1.6_13, then updated to 1.6_21 based on a new SVN Update. But I
haven't been successful yet.

Carina


-------- Original Message --------
> Date: Wed, 21 Jul 2010 12:13:22 +0200
> From: karl.wright@nokia.com
> To: connectors-user@incubator.apache.org
> Subject: Re: Beginner's question

> I'm definitely not seeing this behavior here, with sun jdk 1.6.  It's
> worth getting to the bottom of.
>
> Can you do the following:
>
> (1)     Svn co a completely fresh version of LCF
> (2)     Ant, making sure ant is actually using jdk 1.6
>
> If you *still* get this problem, please let me know.  It's not clear what
> the difference is, but there's got to be a difference somewhere.  I hope it
> is not how Derby works on French machines. ;-)
>
> Karl
>
>
> >>>>>>
> Worker thread aborting and restarting due to database connection reset:
> Database exception: Exception doing query: L'instruction a été abandonnée
> parce qu'elle aurait entraîné la duplication d'une valeur de clé dans
> une contrainte de clé ou d'index unique identifié par 'I1279701064805'
> définie sur 'INGESTSTATUS'.
> org.apache.lcf.core.interfaces.LCFException: Database exception: Exception
> doing query: L'instruction a été abandonnée parce qu'elle aurait
> entraîné la duplication d'une valeur de clé dans une contrainte de clé ou
> d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
>     at
> org.apache.lcf.core.database.Database.executeViaThread(Database.java:421)
>     at
> org.apache.lcf.core.database.Database.executeUncachedQuery(Database.java:449)
>     at
> org.apache.lcf.core.database.Database$QueryCacheExecutor.create(Database.java:1072)
>     at
> org.apache.lcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
>     at
> org.apache.lcf.core.database.Database.executeQuery(Database.java:167)
>     at
> org.apache.lcf.core.database.DBInterfaceDerby.performModification(DBInterfaceDerby.java:615)
>     at
> org.apache.lcf.core.database.DBInterfaceDerby.performInsert(DBInterfaceDerby.java:177)
>     at
> org.apache.lcf.core.database.BaseTable.performInsert(BaseTable.java:76)
>     at
> org.apache.lcf.agents.incrementalingest.IncrementalIngester.noteDocumentIngest(IncrementalIngester.java:1267)
>     at
> org.apache.lcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:410)
>     at
> org.apache.lcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:304)
>     at
> org.apache.lcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1586)
>     at
> org.apache.lcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>     at
> org.apache.lcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:516)
>     at
> org.apache.lcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> Caused by: java.sql.SQLIntegrityConstraintViolationException:
> L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une
> valeur de clé dans une contrainte de clé ou d'index unique identifié par
> 'I1279701064805' définie sur 'INGESTSTATUS'.
>     at
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
>     at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown
> Source)
>     at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
>     at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
>     at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown
> Source)
>     at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown
> Source)
>     at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown
> Source)
>     at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
>     at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
>     at org.apache.lcf.core.database.Database.execute(Database.java:566)
>     at
> org.apache.lcf.core.database.Database$ExecuteQueryThread.run(Database.java:381)
> Caused by: java.sql.SQLException: L'instruction a été abandonnée parce
> qu'elle aurait entraîné la duplication d'une valeur de clé dans une
> contrainte de clé ou d'index unique identifié par 'I1279701064805' définie
> sur 'INGESTSTATUS'.
>     at
> org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>     at
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
>     ... 11 more
>
> However, I can start Jetty and get the UI working.
>
> Thanks
> marc
> <<<<<<
>
>
