manifoldcf-user mailing list archives

From Priya Arora <pr...@smartshore.nl>
Subject Re: Manifoldcf server Error
Date Fri, 20 Dec 2019 10:48:15 GMT
Hi All,

When I try to execute a bash command inside the ManifoldCF container,
I get an error.
[image: image.png]
And when checking the logs with sudo docker logs <CID>:
2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred processing Appender MyFile
org.apache.logging.log4j.core.appender.AppenderLoggingException: Error flushing stream logs/manifoldcf.log
        at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)

Can anybody suggest the reason behind this error?

Thanks
Priya

On Fri, Dec 20, 2019 at 3:37 PM Priya Arora <priya@smartshore.nl> wrote:

> Hi Markus,
>
> Many thanks for your reply!
>
> I tried this approach to reproduce the scenario in a different
> environment, but the case where I got the error listed above is when I am
> crawling INTRANET sites that are accessible over a remote server. Also,
> I have used these transformation connectors: Allow Documents, Tika Parser,
> Content Limiter (10000000), Metadata Adjuster.
>
> When I tried reproducing the error with public sites of the same domain
> on a different server (DEV), the crawl was successful, with no error. There
> was also no Postgres-related error.
>
> Can it depend on server-related configuration such as a firewall? This
> case includes some firewall and security-related configuration.
>
> Thanks
> Priya
>
>
>
>
> On Fri, Dec 20, 2019 at 3:23 PM Markus Schuch <markus_schuch@web.de>
> wrote:
>
>> Hi Priya,
>>
>> In my experience, I would focus on the OutOfMemoryError (OOME).
>> 8 GB can be enough, but it doesn't have to be.
>>
>> First I would check whether the JVM is really getting the desired heap
>> size. The Dockerized environment makes that a little harder to find out,
>> since you need access to the JVM metrics, e.g. via jmxremote.
>> Being able to monitor the JVM metrics helps you correlate the
>> errors with the heap and garbage collection activity.
>>
>> The errors you see in the PostgreSQL JDBC driver may well be related to
>> the OOME.
>>
>> Some questions I would ask myself:
>>
>> Do the problems occur repeatedly only when crawling this specific
>> content source, or only with this specific output connection? Can you
>> reproduce it outside of Docker in a controlled dev environment? Or is it
>> a more general problem with your ManifoldCF instance?
>>
>> Maybe there are some huge files being crawled in your content source?
>> Do you have any kind of transformations configured (e.g. a content size
>> limit)? You should check the job's history for any
>> patterns, such as the error always arising after encountering the same
>> document xy.
>>
>> Cheers
>> Markus
>>
>>
>>
>> On 20.12.2019 at 09:59, Priya Arora wrote:
>> > Hi Markus,
>> >
>> > The heap size defined is 8 GB. In the ManifoldCF start-options-unix file,
>> > the Xmx etc. parameters are set to 8192 MB of memory.
>> >
>> > It seems to be an issue with memory, and also with ManifoldCF's
>> > communication with the database. Do you explicitly define a
>> > connection timer somewhere for communicating with Postgres?
>> > Postgres is installed by pulling a Docker image, followed by some
>> > changes in properties.xml (of ManifoldCF) to connect to the database.
>> > In addition, Elasticsearch also has sufficient memory, and
>> > ManifoldCF is provided with 8 CPU cores.
>> >
>> > Can you suggest a solution?
>> >
>> > Thanks
>> > Priya
>> >
>> > On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch <markus_schuch@web.de> wrote:
>> >
>> >     Hi Priya,
>> >
>> >     Your ManifoldCF JVM suffers from high garbage collection pressure:
>> >
>> >         java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >
>> >     What is your current heap size?
>> >     Without knowing that, I suggest increasing the heap size (java
>> >     -Xmx...).
>> >
>> >     Cheers,
>> >     Markus
>> >
>> >     On 20.12.2019 at 09:02, Priya Arora wrote:
>> >     > Hi All,
>> >     >
>> >     > I am facing the error below while accessing ManifoldCF. The
>> >     > requirement is to crawl data from a website using "Web" as the
>> >     > repository connector and "Elastic Search" as the output connector.
>> >     > ManifoldCF is configured inside a Docker container, and Postgres
>> >     > also runs as a Docker container.
>> >     > When launching ManifoldCF, I get the error below:
>> >     > image.png
>> >     >
>> >     > When I checked the logs:
>> >     > *1)sudo docker exec -it 0b872dfafc5c tail -1000 /usr/share/manifoldcf/example/logs/manifoldcf.log*
>> >     > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed: Timer already cancelled.
>> >     > java.lang.IllegalStateException: Timer already cancelled.
>> >     >         at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
>> >     >         at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
>> >     >         at org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113) ~[postgresql-42.1.3.jar:42.1.3]
>> >     >         at org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887) ~[postgresql-42.1.3.jar:42.1.3]
>> >     >         at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427) ~[postgresql-42.1.3.jar:42.1.3]
>> >     >         at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) ~[postgresql-42.1.3.jar:42.1.3]
>> >     >         at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169) ~[postgresql-42.1.3.jar:42.1.3]
>> >     >         at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136) ~[postgresql-42.1.3.jar:42.1.3]
>> >     >         at org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311) ~[postgresql-42.1.3.jar:42.1.3]
>> >     >         at org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221) ~[mcf-core.jar:?]
>> >     >         at org.apache.manifoldcf.crawler.jobs.Jobs.getActiveJobConnections(Jobs.java:736) ~[mcf-pull-agent.jar:?]
>> >     >         at org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2869) ~[mcf-pull-agent.jar:?]
>> >     >         at org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186) [mcf-pull-agent.jar:?]
>> >     > *2)sudo docker logs <CID> --tail 1000*
>> >     > Exception in thread "PostgreSQL-JDBC-SharedTimer-1" java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >     >         at java.util.ArrayList.iterator(ArrayList.java:840)
>> >     >         at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
>> >     >         at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>> >     >         at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>> >     >         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>> >     >         at java.net.InetAddress.getByName(InetAddress.java:1077)
>> >     >         at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
>> >     >         at org.postgresql.core.PGStream.<init>(PGStream.java:66)
>> >     >         at org.postgresql.core.QueryExecutorBase.sendQueryCancel(QueryExecutorBase.java:155)
>> >     >         at org.postgresql.jdbc.PgConnection.cancelQuery(PgConnection.java:971)
>> >     >         at org.postgresql.jdbc.PgStatement.cancel(PgStatement.java:812)
>> >     >         at org.postgresql.jdbc.PgStatement$1.run(PgStatement.java:880)
>> >     >         at java.util.TimerThread.mainLoop(Timer.java:555)
>> >     >         at java.util.TimerThread.run(Timer.java:505)
>> >     > 2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >     > 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >     > 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >     > 2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>> >     > 2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred processing Appender MyFile
>> >     > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error flushing stream logs/manifoldcf.log
>> >     >         at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)
>> >     >
>> >     > _I also tried cleaning up the database by truncating all
>> >     > ManifoldCF-related tables, but I still get this error._
>> >     >
>> >     > The parameters in the *postgresql conf* file are set as suggested,
>> >     > and "max_pred_locks_per_transaction" is set to "256".
>> >     > image.png
>> >
>>
>
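
Markus's suggestion in the quoted thread, to check whether the JVM is really getting the desired heap size, could be sketched roughly as follows. This is only a minimal illustration using the standard java.lang.management API; the class name HeapCheck and the helper toMiB are made up for this example and are not part of ManifoldCF.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of ManifoldCF.
public class HeapCheck {
    // Convert a byte count to whole mebibytes for readable output.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Upper bound the JVM will attempt to use for the heap (driven by -Xmx).
        long maxHeap = Runtime.getRuntime().maxMemory();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();

        // Show the flags this JVM was actually started with.
        System.out.println("JVM arguments: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
        System.out.println("Max heap (MiB):  " + toMiB(maxHeap));
        System.out.println("Used heap (MiB): " + toMiB(heap.getUsed()));

        // Cumulative collection counts and times since JVM start, per collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Note that this reports only on the JVM that runs it. Started inside the container with the same -Xmx options as ManifoldCF, it can show whether those options take effect there; for the running ManifoldCF process itself, the same MXBeans would have to be read remotely, e.g. via jmxremote as mentioned in the thread.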
