manifoldcf-user mailing list archives

From Olivier Tavard <olivier.tav...@francelabs.com>
Subject Re: Job error during WindowsShare repository connector indexation
Date Wed, 11 Oct 2017 09:54:29 GMT
Hi,

Thanks for your answer.
Yes, I could reach the Samba server from the MCF server. In fact, during the first hours after the
MCF job was launched, thousands of documents were correctly accessed and processed by MCF.
The errors mentioned appeared only after a few hours. Before that, indexing worked correctly.

Best regards,
Olivier TAVARD


> On 11 Oct 2017, at 11:21, Cihad Guzel <cguzelg@gmail.com> wrote:
> 
> Hi Olivier,
> 
> Did you try connecting to the Samba server with a Samba client application? Check iptables on your server. Can you stop iptables on the Ubuntu server? Or maybe you can reconfigure iptables.
> 
> Regards,
> Cihad Guzel
> 
> 
> 2017-10-11 12:02 GMT+03:00 Olivier Tavard <olivier.tavard@francelabs.com>:
> Hi,
> 
> I got this error while crawling a Samba server hosted on Ubuntu Server:
> ERROR 2017-10-05 00:00:14,109 (Idle cleanup thread) - MCF|MCF-agent|apache.manifoldcf.crawlerthreads|Exception tossed: Service '_ANON_0' of type '_REPOSITORYCONNECTORPOOL_SmbFileShare' is not active
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service '_ANON_0' of type '_REPOSITORYCONNECTORPOOL_SmbFileShare' is not active
> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.updateServiceData(BaseLockManager.java:273)
> at org.apache.manifoldcf.core.lockmanager.LockManager.updateServiceData(LockManager.java:108)
> at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.pollAll(ConnectorPool.java:654)
> at org.apache.manifoldcf.core.connectorpool.ConnectorPool.pollAllConnectors(ConnectorPool.java:338)
> at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.pollAllConnectors(RepositoryConnectorPool.java:124)
> at org.apache.manifoldcf.crawler.system.IdleCleanupThread.run(IdleCleanupThread.java:68)
> 
> I used MCF 2.8.1 on Debian 8 with PostgreSQL 9.5.3 and the Windows Share repository connector. The job was configured to process about 2 million files (600 GB).
> For text extraction I used a Tika server (on the same server as MCF) and added the Tika external content extractor transformation connector to the job configuration.
> The error appeared 9 hours after the job was launched. The job status still indicated that the job was running, but there was only 1 document in the active column and the error above was repeated in the MCF log.
> 
> Then I tried running the clean-lock.sh script and got this error:
> WARN 2017-10-09 08:23:56,284 (Idle cleanup thread) - MCF|MCF-agent|apache.manifoldcf.lock|Attempt to set file lock 'mcf/mcf_home/./syncharea/551/442/lock-_POOLTARGET__REPOSITORYCONNECTORPOOL_SmbFileShare.lock' failed: No such file or directory
> java.io.IOException: No such file or directory
> at java.io.UnixFileSystem.createFileExclusively(Native Method)
> at java.io.File.createNewFile(File.java:1012)
> at org.apache.manifoldcf.core.lockmanager.FileLockObject.grabFileLock(FileLockObject.java:223)
> at org.apache.manifoldcf.core.lockmanager.FileLockObject.obtainGlobalWriteLockNoWait(FileLockObject.java:78)
> at org.apache.manifoldcf.core.lockmanager.LockObject.obtainGlobalWriteLock(LockObject.java:121)
> at org.apache.manifoldcf.core.lockmanager.LockObject.enterWriteLock(LockObject.java:74)
> at org.apache.manifoldcf.core.lockmanager.LockGate.enterWriteLock(LockGate.java:177)
> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWrite(BaseLockManager.java:1120)
> at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterWriteLock(BaseLockManager.java:757)
> at org.apache.manifoldcf.core.lockmanager.LockManager.enterWriteLock(LockManager.java:302)
> at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.pollAll(ConnectorPool.java:585)
> at org.apache.manifoldcf.core.connectorpool.ConnectorPool.pollAllConnectors(ConnectorPool.java:338)
> at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.pollAllConnectors(RepositoryConnectorPool.java:124)
> at org.apache.manifoldcf.crawlerui.IdleCleanupThread.run(IdleCleanupThread.java:69)
> The error was then repeated indefinitely in the log.
> 
> Does this mean that there was a problem with the syncharea folder at some point?
> 
> Thanks,
> Best regards,
> 
> Olivier TAVARD
> 
> 
> 
> -- 
> Cihad Güzel

