manifoldcf-user mailing list archives

From Priya Arora <pr...@smartshore.nl>
Subject Re: Manifoldcf server Error
Date Mon, 23 Dec 2019 08:51:30 GMT
Hi All,

I am facing the error below while executing jobs on ManifoldCF.
I have implemented all the PostgreSQL performance tuning parameters
suggested in the ManifoldCF documentation.
Can anybody suggest a reason for this error and a solution?

[image: image.png]
[image: image.png]

The parameters configured for tuning are:
[image: image.png]
Thanks
Priya

On Fri, Dec 20, 2019 at 4:43 PM Priya Arora <priya@smartshore.nl> wrote:

> Hi Markus,
> Yes, I also tried to start/restart the container with docker commands; it
> also results in an error.
> Here is the complete stack trace from the logs:
> [image: image.png]
>
> On Fri, Dec 20, 2019 at 4:39 PM Markus Schuch <markus_schuch@web.de>
> wrote:
>
>> Hi Priya,
>>
>> the container you are trying to interactively execute a command in is no
>> longer running. It is not possible to execute commands in stopped
>> containers.
>>
>> The logger issues might be related to missing file system permissions.
>> But that's a wild guess. Is there a "Caused by" part in the stack trace of
>> the AppenderLoggingException?
>>
>> Cheers,
>> Markus
>>
>> Am 20.12.2019 um 11:48 schrieb Priya Arora:
>> > Hi All,
>> >
>> > When I try to execute a bash command inside the manifoldcf container,
>> > I get an error.
>> > image.png
>> > And when checking the logs with sudo docker logs <CID>:
>> > 2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream
>> > logs/manifoldcf.log for appender MyFile
>> > 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream
>> > logs/manifoldcf.log for appender MyFile
>> > 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream
>> > logs/manifoldcf.log for appender MyFile
>> > 2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to
>> > stream logs/manifoldcf.log for appender MyFile
>> > 2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred
>> > processing Appender MyFile
>> > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error
>> > flushing stream logs/manifoldcf.log
>> >         at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)
>> >
>> > Can anybody suggest the reason behind this error?
>> >
>> > Thanks
>> > Priya
>> >
>> > On Fri, Dec 20, 2019 at 3:37 PM Priya Arora <priya@smartshore.nl> wrote:
>> >
>> >     Hi Markus,
>> >
>> >     Many thanks for your reply!
>> >
>> >     I tried this approach to reproduce the scenario in a different
>> >     environment, but the case where I listed the error above is when I
>> >     am crawling INTRANET sites, which are accessible over a remote
>> >     server. I have also used these transformation connectors: Allow
>> >     Documents, Tika Parser, Content Limiter (10000000), Metadata Adjuster.
>> >
>> >     When I tried reproducing the error with public sites of the same
>> >     domain on a different server (DEV), the crawl was successful, with no
>> >     errors. There was also no Postgres-related error.
>> >
>> >     Could it depend on server-related configuration such as a firewall,
>> >     since this case involves some firewall and security-related
>> >     configuration?
>> >
>> >     Thanks
>> >     Priya
>> >
>> >
>> >
>> >
>> >     On Fri, Dec 20, 2019 at 3:23 PM Markus Schuch <markus_schuch@web.de> wrote:
>> >
>> >         Hi Priya,
>> >
>> >         in my experience, I would focus on the OutOfMemoryError (OOME).
>> >         8 gigs can be enough, but it doesn't have to be.
>> >
>> >         First I would check whether the JVM is really getting the desired
>> >         heap size. The dockerized environment makes that a little harder to
>> >         find out, since you need access to the JVM metrics, e.g. via
>> >         jmxremote. Being able to monitor the JVM metrics helps you
>> >         correlate the errors with heap and garbage collection activity.
>> >
>> >         The errors you see in the PostgreSQL JDBC driver may well be
>> >         related to the OOME.
>> >
>> >         Some questions I would ask myself:
>> >
>> >         Do the problems occur repeatedly only when crawling this specific
>> >         content source, or only with this specific output connection? Can
>> >         you reproduce it outside of docker in a controlled dev environment?
>> >         Or is it a more general problem with your ManifoldCF instance?
>> >
>> >         Maybe there are some huge files being crawled in your content
>> >         source? Do you have any kind of transformations configured (e.g. a
>> >         content size limit)? You should check the job's history for
>> >         patterns, such as the error always arising after encountering the
>> >         same document xy.
>> >
>> >         Cheers
>> >         Markus
>> >
>> >
>> >
>> >         Am 20.12.2019 um 09:59 schrieb Priya Arora:
>> >         > Hi  Markus ,
>> >         >
>> >         > The heap size defined is 8 GB. In the ManifoldCF start-options-unix
>> >         > file, the Xmx etc. parameters are set to 8192 MB of memory.
>> >         >
>> >         > It seems to be a memory issue, and it also shows up when ManifoldCF
>> >         > tries to communicate with the database. Do you explicitly define a
>> >         > connection timer somewhere for communicating with Postgres?
>> >         > Postgres was installed via a docker image pull, followed by some
>> >         > changes in properties.xml (of ManifoldCF) to connect to the database.
>> >         > On the other hand, Elasticsearch also has sufficient memory, and
>> >         > ManifoldCF is provided with an 8-core CPU.
>> >         >
>> >         > Can you suggest a solution?
>> >         >
>> >         > Thanks
>> >         > Priya
>> >         >
>> >         > On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch <markus_schuch@web.de> wrote:
>> >         >
>> >         >     Hi Priya,
>> >         >
>> >         >     your ManifoldCF JVM suffers from high garbage collection pressure:
>> >         >
>> >         >         java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >         >
>> >         >     What is your current heap size?
>> >         >     Without knowing that, I suggest increasing the heap size (java -Xmx...).
>> >         >
>> >         >     Cheers,
>> >         >     Markus
>> >         >
>> >         >     Am 20.12.2019 um 09:02 schrieb Priya Arora:
>> >         >     > Hi All,
>> >         >     >
>> >         >     > I am facing the error below while accessing ManifoldCF. The
>> >         >     > requirement is to crawl data from a website using "Web" as the
>> >         >     > repository and "Elastic Search" as the output connector.
>> >         >     > ManifoldCF is configured inside a docker container, and Postgres
>> >         >     > also runs as a docker container.
>> >         >     > When launching ManifoldCF I get the error below:
>> >         >     > image.png
>> >         >     >
>> >         >     > When I checked the logs:
>> >         >     > *1)sudo docker exec -it 0b872dfafc5c tail -1000
>> >         >     > /usr/share/manifoldcf/example/logs/manifoldcf.log*
>> >         >     > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed: Timer already cancelled.
>> >         >     > java.lang.IllegalStateException: Timer already cancelled.
>> >         >     >         at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
>> >         >     >         at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
>> >         >     >         at org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113) ~[postgresql-42.1.3.jar:42.1.3]
>> >         >     >         at org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887) ~[postgresql-42.1.3.jar:42.1.3]
>> >         >     >         at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427) ~[postgresql-42.1.3.jar:42.1.3]
>> >         >     >         at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) ~[postgresql-42.1.3.jar:42.1.3]
>> >         >     >         at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169) ~[postgresql-42.1.3.jar:42.1.3]
>> >         >     >         at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136) ~[postgresql-42.1.3.jar:42.1.3]
>> >         >     >         at org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311) ~[postgresql-42.1.3.jar:42.1.3]
>> >         >     >         at org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221) ~[mcf-core.jar:?]
>> >         >     >         at org.apache.manifoldcf.crawler.jobs.Jobs.getActiveJobConnections(Jobs.java:736) ~[mcf-pull-agent.jar:?]
>> >         >     >         at org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2869) ~[mcf-pull-agent.jar:?]
>> >         >     >         at org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186) [mcf-pull-agent.jar:?]
>> >         >     > *2)sudo docker logs <CID> --tail 1000*
>> >         >     > Exception in thread "PostgreSQL-JDBC-SharedTimer-1" java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >         >     >         at java.util.ArrayList.iterator(ArrayList.java:840)
>> >         >     >         at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
>> >         >     >         at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>> >         >     >         at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>> >         >     >         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>> >         >     >         at java.net.InetAddress.getByName(InetAddress.java:1077)
>> >         >     >         at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
>> >         >     >         at org.postgresql.core.PGStream.<init>(PGStream.java:66)
>> >         >     >         at org.postgresql.core.QueryExecutorBase.sendQueryCancel(QueryExecutorBase.java:155)
>> >         >     >         at org.postgresql.jdbc.PgConnection.cancelQuery(PgConnection.java:971)
>> >         >     >         at org.postgresql.jdbc.PgStatement.cancel(PgStatement.java:812)
>> >         >     >         at org.postgresql.jdbc.PgStatement$1.run(PgStatement.java:880)
>> >         >     >         at java.util.TimerThread.mainLoop(Timer.java:555)
>> >         >     >         at java.util.TimerThread.run(Timer.java:505)
>> >         >     > 2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >         >     > 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >         >     > 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >         >     > 2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >         >     > 2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred processing Appender MyFile
>> >         >     > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error flushing stream logs/manifoldcf.log
>> >         >     >         at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159).
>> >         >     >
>> >         >     > _I also tried cleaning up the database by truncating all
>> >         >     > ManifoldCF-related tables, but I still get this error._
>> >         >     >
>> >         >     > The parameters defined in the *postgresql conf* file are as
>> >         >     > suggested, and "max_pred_locks_per_transaction" is set to "256".
>> >         >     > image.png
>> >         >
>> >
>>
>
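
Markus's suggestion in the thread to check whether the JVM really gets the
desired heap size could be sketched roughly as follows. This is a minimal
sketch and not something posted by the participants: the class name HeapCheck
is hypothetical, and it reports the limits of the JVM it runs in, not those of
an already running ManifoldCF process. It is therefore only informative when
launched inside the same container with the same -Xmx options as ManifoldCF.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints the heap limit and GC counters of the current JVM.
final class HeapCheck {
    private static final long MIB = 1024L * 1024L;

    /** Maximum heap this JVM will attempt to use, in MiB (reflects -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("max heap (MiB):  " + maxHeapMiB());
        System.out.println("used heap (MiB): " + heap.getUsed() / MIB);

        // Cumulative collection counts and times since JVM start, per collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

If the reported maximum is far below the 8192 MB mentioned in the thread, the
options are probably not reaching the JVM. For the live process, remote JMX
monitoring as Markus describes remains the more direct route.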

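Markus's guess that the "Unable to write to stream logs/manifoldcf.log" errors
could come from missing file system permissions can be tested with a small
check like the one below. It is only a sketch under assumptions: the class name
LogPathCheck is hypothetical, the default path is the relative one from the log
output, and it must run as the same user and in the same working directory as
ManifoldCF. It does not detect other causes such as a full disk.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports whether a log file location is writable.
final class LogPathCheck {

    /** Returns a short diagnosis for the given log file path. */
    static String diagnose(Path logFile) {
        Path dir = logFile.toAbsolutePath().getParent();
        if (dir == null || !Files.isDirectory(dir)) {
            return "missing directory: " + dir;
        }
        if (Files.exists(logFile)) {
            // Existing file: the file itself must be writable.
            return Files.isWritable(logFile) ? "writable" : "file not writable";
        }
        // No file yet: the directory must allow creating it.
        return Files.isWritable(dir) ? "writable" : "directory not writable";
    }

    public static void main(String[] args) {
        Path p = Paths.get(args.length > 0 ? args[0] : "logs/manifoldcf.log");
        System.out.println(p.toAbsolutePath() + " -> " + diagnose(p));
    }
}
```

A result of "writable" would point away from permissions and back toward the
OutOfMemoryError discussed earlier in the thread.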