manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Manifoldcf server Error
Date Fri, 27 Dec 2019 12:53:38 GMT
Hi Priya,

ManifoldCF is written in Java, and Java uses garbage collection to "clean
up" memory.
The reason you are running out of memory is that you haven't given your
Java processes enough of it.  See the -Xmx switch.

Karl
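Karl's -Xmx pointer can be sanity-checked from inside the process. A tiny sketch (the class name is hypothetical; run it with the same -Xmx flag you pass to the ManifoldCF start scripts) that prints what the flag actually resolved to:

```java
public class MaxHeapCheck {
    // Runtime.maxMemory() reports the effective heap ceiling, i.e. what
    // -Xmx resolved to (or the JVM default if no flag was given).
    public static void main(String[] args) {
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("max heap: " + (maxBytes / (1024 * 1024)) + " MB");
    }
}
```

If the printed value is far below what you configured, the flag is not reaching the agents process.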


On Fri, Dec 27, 2019 at 6:16 AM Priya Arora <priya@smartshore.nl> wrote:

> Hi Karl,
>
> One thing I want to understand here is: is there any functionality,
> implemented within the code or otherwise, such that if I have configured 3 jobs on
> ManifoldCF and, say, one of them is done, it cleans up the
> occupied memory after it finishes?
> The reason I am asking is that I ran a job and continuously analysed
> docker stats to view the run-time memory the job occupies (the total memory of
> the server is 15GB).
> [image: image.png]
> After one job was done its memory usage was almost 2.3 GB; then I started
> another job, and it started to take up memory beyond 2.3 GB.
>
> How does ManifoldCF clean up memory after a process/job is completed?
>
> Thanks
> Priya
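The behaviour observed here is expected for a JVM: once a job's data becomes unreachable, the garbage collector reclaims it inside the heap, but the JVM typically does not shrink its heap back, so `docker stats` (which reports the process footprint) keeps showing the high-water mark. A minimal sketch of that distinction (the array size and the explicit `System.gc()` call are illustrative only):

```java
public class HeapFootprintDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        byte[] jobData = new byte[64 * 1024 * 1024]; // stand-in for a job's working set
        jobData[0] = 1;                              // touch it so it is really allocated
        long usedWhileRunning = rt.totalMemory() - rt.freeMemory();
        jobData = null;   // "job done": the data becomes unreachable
        System.gc();      // a GC hint; the collector frees the data *within* the heap...
        long usedAfter = rt.totalMemory() - rt.freeMemory();
        // ...but totalMemory() (and the OS-visible footprint) usually stays put.
        System.out.println("used while running: " + (usedWhileRunning >> 20) + " MB");
        System.out.println("used after GC:      " + (usedAfter >> 20) + " MB");
    }
}
```

So a stable 2.3 GB reading after a job is not necessarily a leak; what matters is whether the heap is large enough for the jobs that run concurrently.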
>
> On Fri, Dec 27, 2019 at 2:51 PM Karl Wright <daddywri@gmail.com> wrote:
>
>> The out of memory error will cause all sorts of errors.  You need to fix
>> that first.
>> Karl
>>
>> On Fri, Dec 27, 2019, 2:10 AM Priya Arora <priya@smartshore.nl> wrote:
>>
>>> HI karl,
>>>
>>> The reason I guess the error can be due to Postgres is the
>>> error logs below, because they hint at some error in the database:-
>>>
>>> Dec 24, 2019 8:06:56 AM
>>> org.apache.tika.config.InitializableProblemHandler$3
>>> handleInitializableProblem
>>> WARNING: org.xerial's sqlite-jdbc is not loaded.
>>> Please provide the jar on your classpath to parse sqlite files.
>>> See tika-parsers/pom.xml for the correct version.
>>> agents process ran out of memory - shutting down
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>> [Thread-505] INFO org.eclipse.jetty.server.ServerConnector - Stopped
>>> ServerConnector@4b40f651{HTTP/1.1}{0.0.0.0:8345}
>>> [Thread-505] INFO org.eclipse.jetty.server.handler.ContextHandler -
>>> Stopped o.e.j.w.WebAppContext@64e7619d
>>> {/mcf-api-service,file:/tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-1694133755722639545.dir/webapp/,UNAVAILABLE}{/usr/share/manifoldcf/example/./../web/war/mcf-api-service.war}
>>> [Thread-505] INFO org.eclipse.jetty.server.handler.ContextHandler -
>>> Stopped o.e.j.w.WebAppContext@5f0fd5a0
>>> {/mcf-authority-service,file:/tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-5958972787774826450.dir/webapp/,UNAVAILABLE}{/usr/share/manifoldcf/example/./../web/war/mcf-authority-service.war}
>>> agents process ran out of memory - shutting down
>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database
>>> exception: SQLException doing query (02000): No results were returned by
>>> the query.
>>>         at
>>> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.finishUp(Database.java:715)
>>>         at
>>> org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:741)
>>>         at
>>> org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:803)
>>>         at
>>> org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457)
>>>         at
>>> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
>>>         at
>>> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)
>>>         at
>>> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837)
>>>         at
>>> org.apache.manifoldcf.crawler.jobs.JobManager.buildCountsUsingGroupBy(JobManager.java:9149)
>>>         at
>>> org.apache.manifoldcf.crawler.jobs.JobManager.makeJobStatus(JobManager.java:8855)
>>>         at
>>> org.apache.manifoldcf.crawler.jobs.JobManager.getAllStatus(JobManager.java:8742)
>>>         at
>>> org.apache.jsp.showjobstatus_jsp._jspService(showjobstatus_jsp.java:238)
>>>         at
>>> org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>>>         at
>>> org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:388)
>>>         at
>>> org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313)
>>>         at
>>> org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260)
>>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
>>>         at
>>> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769)
>>>         at
>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>>>         at
>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>>>         at
>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:595)
>>>         at
>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>>>         at
>>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
>>>         at
>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>>>         at
>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>>>         at
>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
>>>
>>> On Thu, Dec 26, 2019 at 7:21 PM Karl Wright <daddywri@gmail.com> wrote:
>>>
>>>> Hi Priya,
>>>>
>>>> Deadlocks happen all the time in ManifoldCF and the software simply
>>>> retries, so some of this is quite normal. It should *not*, however, abort
>>>> the job or cause the agents process to shut down.
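The retry Karl describes can be pictured as a loop that re-runs a transaction whenever PostgreSQL reports a serialization failure (SQLSTATE 40001, the code behind the "could not serialize access" errors quoted later in this thread, whose HINT is "The transaction might succeed if retried"). A simplified sketch of the general pattern, not ManifoldCF's actual implementation (`withRetry` is a hypothetical helper):

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

public class SerializationRetry {
    // Re-run the transaction body until it stops failing with a PostgreSQL
    // serialization failure (SQLSTATE 40001). Any other SQLException is real
    // and is rethrown.
    public static <T> T withRetry(Callable<T> tx) throws Exception {
        while (true) {
            try {
                return tx.call();
            } catch (SQLException e) {
                if (!"40001".equals(e.getSQLState())) throw e; // genuine error
                // serialization failure: roll back and run the body again
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate a transaction that fails twice before succeeding.
        int[] attempts = {0};
        String result = withRetry(() -> {
            if (++attempts[0] < 3) {
                throw new SQLException("could not serialize access", "40001");
            }
            return "ok";
        });
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

This is why such errors in the PostgreSQL log are normal background noise for ManifoldCF, as long as the retries eventually succeed.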
>>>>
>>>> What is the actual symptom you are seeing, operationally?
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Thu, Dec 26, 2019 at 7:25 AM Priya Arora <priya@smartshore.nl>
>>>> wrote:
>>>>
>>>>> Yes, it's inside a docker container. I also checked the postgres container
>>>>> logs and found the errors below.
>>>>>
>>>>> The error log is huge; I am attaching some of the lines for reference.
>>>>> In almost all of the queries the processid is being set to NULL. Any clue
>>>>> why this is happening, or is there any configuration I am missing?
>>>>>
>>>>> ERROR:  could not serialize access due to read/write dependencies
>>>>> among transactions
>>>>> DETAIL:  Reason code: Canceled on identification as a pivot, during
>>>>> conflict out checking.
>>>>> HINT:  The transaction might succeed if retried.
>>>>> STATEMENT:  UPDATE intrinsiclink SET processid=null,isnew=$2 WHERE
>>>>> jobid=$3 AND parentidhash=$4 AND linktype=$5 AND childidhash=$6
>>>>> ERROR:  could not serialize access due to read/write dependencies
>>>>> among transactions
>>>>> DETAIL:  Reason code: Canceled on identification as a pivot, during
>>>>> write.
>>>>> HINT:  The transaction might succeed if retried.
>>>>> STATEMENT:  INSERT INTO jobqueue
>>>>> (jobid,docpriority,checktime,docid,needpriority,dochash,id,checkaction,status)
>>>>> VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9)
>>>>> ERROR:  could not serialize access due to read/write dependencies
>>>>> among transactions
>>>>> DETAIL:  Reason code: Canceled on identification as a pivot, during
>>>>> conflict out checking.
>>>>> HINT:  The transaction might succeed if retried.
>>>>> STATEMENT:  UPDATE intrinsiclink SET processid=null,isnew=$2 WHERE
>>>>> jobid=$3 AND parentidhash=$4 AND linktype=$5 AND childidhash=$6
>>>>> ERROR:  could not serialize access due to read/write dependencies
>>>>> among transactions
>>>>> DETAIL:  Reason code: Canceled on identification as a pivot, during
>>>>> write.
>>>>> HINT:  The transaction might succeed if retried.
>>>>> STATEMENT:  INSERT INTO jobqueue
>>>>> (jobid,docpriority,checktime,docid,needpriority,dochash,id,checkaction,status)
>>>>> VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9)
>>>>> ERROR:  could not serialize access due to read/write dependencies
>>>>> among transactions
>>>>> DETAIL:  Reason code: Canceled on identification as a pivot, during
>>>>> write.
>>>>> HINT:  The transaction might succeed if retried.
>>>>> STATEMENT:  INSERT INTO hopcount
>>>>> (jobid,parentidhash,distance,linktype,id,deathmark) VALUES
>>>>> ($1,$2,$3,$4,$5,$6)
>>>>> ERROR:  could not serialize access due to read/write dependencies
>>>>> among transactions
>>>>> DETAIL:  Reason code: Canceled on identification as a pivot, during
>>>>> conflict in checking.
>>>>> HINT:  The transaction might succeed if retried.
>>>>> STATEMENT:  INSERT INTO prereqevents (owner,eventname) VALUES ($1,$2)
>>>>> ERROR:  could not serialize access due to read/write dependencies
>>>>> among transactions
>>>>> DETAIL:  Reason code: Canceled on identification as a pivot, during
>>>>> write.
>>>>>
>>>>> On Thu, Dec 26, 2019 at 5:18 PM Karl Wright <daddywri@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> For the error you are reporting, where is it occurring?  If it is
>>>>>> occurring in the container, see if you can determine the difference between
>>>>>> your deployment setup there and the deployment setup you have OUTSIDE the
>>>>>> container, which I presume does not show any errors.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>> On Thu, Dec 26, 2019 at 4:09 AM Priya Arora <priya@smartshore.nl>
>>>>>> wrote:
>>>>>>
>>>>>>> Yes, this is inside a docker container. The ManifoldCF version used is
>>>>>>> 2.14, downloaded inside the container using the binary distribution. I
>>>>>>> have also checked the PostgreSQL JAR version: it is postgresql-42.1.3 (path
>>>>>>> ::/usr/share/manifoldcf/lib#). PostgreSQL was also downloaded as part of the
>>>>>>> docker image pull (the postgres version is postgres:9.6.10).
>>>>>>>
>>>>>>> "More importantly, do you have a non-containerized version of
>>>>>>> ManifoldCF set up using the standard build process that you can compare
>>>>>>> against?" :- I am using this configuration (as mentioned above) on the
>>>>>>> client server through which we are accessing the intranet sites.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Dec 26, 2019 at 2:12 PM Karl Wright <daddywri@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> This is running in a container, correct?  What version of the
>>>>>>>> postgresql JDBC driver are you including?  Is it the version that we
>>>>>>>> download when you enter "ant make-deps"?  What version of postgresql are
>>>>>>>> you using?
>>>>>>>>
>>>>>>>> More importantly, do you have a non-containerized version of
>>>>>>>> ManifoldCF set up using the standard build process that you can compare
>>>>>>>> against?  It's really essential when you are trying to diagnose what's
>>>>>>>> going wrong in an alternate environment to have a NON alternate version
>>>>>>>> around to work with.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Karl
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Dec 26, 2019 at 2:49 AM Priya Arora <priya@smartshore.nl>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Karl,
>>>>>>>>>
>>>>>>>>> Can you please help me with this? I am facing this error, which is
>>>>>>>>> causing my crawler to crash.
>>>>>>>>>
>>>>>>>>> It's my humble request that you help me out. I have been struggling
>>>>>>>>> and searching the internet for many days.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Priya
>>>>>>>>>
>>>>>>>>> On Mon, Dec 23, 2019 at 2:21 PM Priya Arora <priya@smartshore.nl>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> I am facing the error below while executing jobs on ManifoldCF.
>>>>>>>>>> I have implemented all the PostgreSQL performance tuning parameters
>>>>>>>>>> suggested in the ManifoldCF documentation.
>>>>>>>>>> Can anybody suggest a reason for, and solution to, this?
>>>>>>>>>>
>>>>>>>>>> [image: image.png]
>>>>>>>>>> [image: image.png]
>>>>>>>>>>
>>>>>>>>>> The parameters configured for tuning are:
>>>>>>>>>> [image: image.png]
>>>>>>>>>> Thanks
>>>>>>>>>> Priya
>>>>>>>>>>
>>>>>>>>>> On Fri, Dec 20, 2019 at 4:43 PM Priya Arora <priya@smartshore.nl>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Markus,
>>>>>>>>>>> Yes, I also tried to start/restart the container with docker
>>>>>>>>>>> commands; that also results in an error.
>>>>>>>>>>> Here is the complete stack trace from the logs:-
>>>>>>>>>>> [image: image.png]
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Dec 20, 2019 at 4:39 PM Markus Schuch <
>>>>>>>>>>> markus_schuch@web.de> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Priya,
>>>>>>>>>>>>
>>>>>>>>>>>> the container you are trying to interactively execute a command
>>>>>>>>>>>> in is no longer running. It is not possible to execute commands
>>>>>>>>>>>> in stopped containers.
>>>>>>>>>>>>
>>>>>>>>>>>> The logger issues might be related to missing file system
>>>>>>>>>>>> permissions, but that's a wild guess. Is there a "Caused by"
>>>>>>>>>>>> part in the stack trace of the AppenderLoggingException?
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Markus
>>>>>>>>>>>>
>>>>>>>>>>>> On 20.12.2019 at 11:48, Priya Arora wrote:
>>>>>>>>>>>> > Hi All,
>>>>>>>>>>>> >
>>>>>>>>>>>> > When I try to execute a bash command inside the manifoldcf
>>>>>>>>>>>> > container I get an error.
>>>>>>>>>>>> > image.png
>>>>>>>>>>>> > And when checking the logs with sudo docker logs <CID>:
>>>>>>>>>>>> > 2019-12-19 18:09:05,848 Job start thread ERROR Unable to
>>>>>>>>>>>> write to stream
>>>>>>>>>>>> > logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> > 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write
>>>>>>>>>>>> to stream
>>>>>>>>>>>> > logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> > 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to
>>>>>>>>>>>> write to stream
>>>>>>>>>>>> > logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> > 2019-12-19 18:09:05,848 Job notification thread ERROR Unable
>>>>>>>>>>>> to write to
>>>>>>>>>>>> > stream logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> > 2019-12-19 18:09:05,849 Seeding thread ERROR An exception
>>>>>>>>>>>> occurred
>>>>>>>>>>>> > processing Appender MyFile org
>>>>>>>>>>>> >
>>>>>>>>>>>>  .apache.logging.log4j.core.appender.AppenderLoggingException: Error
>>>>>>>>>>>> > flushing stream logs/manifoldcf.log
>>>>>>>>>>>> >         at
>>>>>>>>>>>> >
>>>>>>>>>>>> org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)
>>>>>>>>>>>> >
>>>>>>>>>>>> > Can anybody suggest the reason behind this error?
>>>>>>>>>>>> >
>>>>>>>>>>>> > Thanks
>>>>>>>>>>>> > Priya
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Fri, Dec 20, 2019 at 3:37 PM Priya Arora <
>>>>>>>>>>>> priya@smartshore.nl
>>>>>>>>>>>> > <mailto:priya@smartshore.nl>> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> >     Hi Markus,
>>>>>>>>>>>> >
>>>>>>>>>>>> >     Many thanks for your reply!
>>>>>>>>>>>> >
>>>>>>>>>>>> >     I tried this approach to reproduce the scenario in a
>>>>>>>>>>>> >     different environment, but the case where I listed the
>>>>>>>>>>>> >     error above is when I am crawling INTRANET sites, which
>>>>>>>>>>>> >     are accessible over a remote server. I have also used
>>>>>>>>>>>> >     these transformation connectors: Allow Documents, Tika
>>>>>>>>>>>> >     Parser, Content Limiter (10000000), and Metadata Adjuster.
>>>>>>>>>>>> >
>>>>>>>>>>>> >     When I tried reproducing the error with public sites of
>>>>>>>>>>>> >     the same domain and on a different server (DEV), the crawl
>>>>>>>>>>>> >     was successful, with no errors. There were also no
>>>>>>>>>>>> >     Postgres-related errors.
>>>>>>>>>>>> >
>>>>>>>>>>>> >     Can it depend on server-related configuration like the
>>>>>>>>>>>> >     firewall, as this case includes some firewall/security
>>>>>>>>>>>> >     related configuration?
>>>>>>>>>>>> >
>>>>>>>>>>>> >     Thanks
>>>>>>>>>>>> >     Priya
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> >     On Fri, Dec 20, 2019 at 3:23 PM Markus Schuch <
>>>>>>>>>>>> markus_schuch@web.de
>>>>>>>>>>>> >     <mailto:markus_schuch@web.de>> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> >         Hi Priya,
>>>>>>>>>>>> >
>>>>>>>>>>>> >         in my experience, I would focus on the
>>>>>>>>>>>> >         OutOfMemoryError (OOME).
>>>>>>>>>>>> >         8 gigs can be enough, but they don't have to be.
>>>>>>>>>>>> >
>>>>>>>>>>>> >         First I would check whether the JVM is really getting
>>>>>>>>>>>> >         the desired heap size. The dockered environment makes
>>>>>>>>>>>> >         that a little harder to find out, since you need
>>>>>>>>>>>> >         access to the JVM metrics, e.g. via jmxremote.
>>>>>>>>>>>> >         Being able to monitor the JVM metrics helps you
>>>>>>>>>>>> >         correlate the errors with heap and garbage collection
>>>>>>>>>>>> >         activity.
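As a lighter-weight alternative to setting up jmxremote, the same heap and GC numbers a JMX client would read can be logged in-process via the platform MXBeans. A sketch (how and where to wire this into ManifoldCF's startup is an assumption left to the reader):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapProbe {
    public static void main(String[] args) {
        // Same data a remote JMX console would show for the heap.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap used=%dMB committed=%dMB max=%dMB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
        // Collection counts/time growing rapidly while little work completes
        // is the signature behind "GC overhead limit exceeded".
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```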
>>>>>>>>>>>> >
>>>>>>>>>>>> >         The errors you see from the PostgreSQL JDBC driver
>>>>>>>>>>>> >         might well be related to the OOME.
>>>>>>>>>>>> >
>>>>>>>>>>>> >         Some questions I would ask myself:
>>>>>>>>>>>> >
>>>>>>>>>>>> >         Do the problems repeatedly occur only when crawling
>>>>>>>>>>>> >         this specific content source, or only with this
>>>>>>>>>>>> >         specific output connection? Can you reproduce it
>>>>>>>>>>>> >         outside of docker in a controlled dev environment?
>>>>>>>>>>>> >         Or is it a more general problem with your manifoldcf
>>>>>>>>>>>> >         instance?
>>>>>>>>>>>> >
>>>>>>>>>>>> >         Maybe there are some huge files being crawled in your
>>>>>>>>>>>> >         content source? Do you have any kind of
>>>>>>>>>>>> >         transformations configured? (e.g. a content size
>>>>>>>>>>>> >         limit?) You should look in the job's history to see
>>>>>>>>>>>> >         if there are any patterns, like the error always
>>>>>>>>>>>> >         arising after encountering the same document xy.
>>>>>>>>>>>> >
>>>>>>>>>>>> >         Cheers
>>>>>>>>>>>> >         Markus
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> >
>>>>>>>>>>>> >         On 20.12.2019 at 09:59, Priya Arora wrote:
>>>>>>>>>>>> >         > Hi  Markus ,
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         > The heap size defined is 8GB. In the ManifoldCF
>>>>>>>>>>>> >         > start-options-unix file the Xmx etc. parameters are
>>>>>>>>>>>> >         > set to give 8192MB of memory.
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         > It seems to be an issue with memory, and it also
>>>>>>>>>>>> >         > arises when ManifoldCF tries to communicate with the
>>>>>>>>>>>> >         > database. Do you explicitly define somewhere a
>>>>>>>>>>>> >         > connection timer for when to communicate with
>>>>>>>>>>>> >         > Postgres?
>>>>>>>>>>>> >         > Postgres is installed as part of the docker image
>>>>>>>>>>>> >         > pull, followed by some changes in properties.xml
>>>>>>>>>>>> >         > (of ManifoldCF) to connect to the database.
>>>>>>>>>>>> >         > On the other hand, Elasticsearch is also holding
>>>>>>>>>>>> >         > sufficient memory, and ManifoldCF is also provided
>>>>>>>>>>>> >         > with 8 CPU cores.
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         > Can you suggest some solution.
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         > Thanks
>>>>>>>>>>>> >         > Priya
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         > On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch
>>>>>>>>>>>> >         <markus_schuch@web.de <mailto:markus_schuch@web.de>
>>>>>>>>>>>> >         > <mailto:markus_schuch@web.de <mailto:
>>>>>>>>>>>> markus_schuch@web.de>>>
>>>>>>>>>>>> >         wrote:
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         >     Hi Priya,
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         >     your manifoldcf JVM suffers from high garbage
>>>>>>>>>>>> collection
>>>>>>>>>>>> >         pressure:
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         >         java.lang.OutOfMemoryError: GC overhead
>>>>>>>>>>>> limit exceeded
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         >     What is your current heap size?
>>>>>>>>>>>> >         >     Without knowing that, I suggest increasing the
>>>>>>>>>>>> >         >     heap size (java -Xmx...).
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         >     Cheers,
>>>>>>>>>>>> >         >     Markus
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >         >     On 20.12.2019 at 09:02, Priya Arora wrote:
>>>>>>>>>>>> >         >     > Hi All,
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >     > I am facing the error below while accessing
>>>>>>>>>>>> >         >     > ManifoldCF. The requirement is to crawl data
>>>>>>>>>>>> >         >     > from a website using the "Web" repository
>>>>>>>>>>>> >         >     > connector and the "Elastic Search" output
>>>>>>>>>>>> >         >     > connector.
>>>>>>>>>>>> >         >     > ManifoldCF is configured inside a docker
>>>>>>>>>>>> >         >     > container, and postgres is also used as a
>>>>>>>>>>>> >         >     > docker container.
>>>>>>>>>>>> >         >     > When launching ManifoldCF I get the error below:
>>>>>>>>>>>> >         >     > image.png
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >     > When checked logs:-
>>>>>>>>>>>> >         >     > *1)sudo docker exec -it 0b872dfafc5c tail
>>>>>>>>>>>> -1000
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> /usr/share/manifoldcf/example/logs/manifoldcf.log*
>>>>>>>>>>>> >         >     > FATAL 2019-12-20T06:06:13,176 (Stuffer
>>>>>>>>>>>> thread) - Error
>>>>>>>>>>>> >         tossed: Timer
>>>>>>>>>>>> >         >     > already cancelled.
>>>>>>>>>>>> >         >     > java.lang.IllegalStateException: Timer
>>>>>>>>>>>> already cancelled.
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> java.util.Timer.sched(Timer.java:397)
>>>>>>>>>>>> >         ~[?:1.8.0_232]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> java.util.Timer.schedule(Timer.java:193)
>>>>>>>>>>>> >         ~[?:1.8.0_232]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >
>>>>>>>>>>>>  org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113)
>>>>>>>>>>>> >         >     > ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >
>>>>>>>>>>>>  org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887)
>>>>>>>>>>>> >         >     > ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >
>>>>>>>>>>>>  org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427)
>>>>>>>>>>>> >         >     > ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >
>>>>>>>>>>>>  org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354)
>>>>>>>>>>>> >         >     > ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169)
>>>>>>>>>>>> >         >     > ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136)
>>>>>>>>>>>> >         >     > ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >
>>>>>>>>>>>>  org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311)
>>>>>>>>>>>> >         >     > ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221)
>>>>>>>>>>>> >         >     > ~[mcf-core.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >          org.apache.manifoldcf.crawler.jobs.Jobs
>>>>>>>>>>>> .getActiveJobConnections(Jobs.java:736)
>>>>>>>>>>>> >         >     > ~[mcf-pull-agent.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2869)
>>>>>>>>>>>> >         >     > ~[mcf-pull-agent.jar:?]
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186)
>>>>>>>>>>>> >         >     > [mcf-pull-agent.jar:?]
>>>>>>>>>>>> >         >     > *2)sudo docker logs <CID> --tail 1000*
>>>>>>>>>>>> >         >     > Exception in thread
>>>>>>>>>>>> "PostgreSQL-JDBC-SharedTimer-1"
>>>>>>>>>>>> >         >     > java.lang.OutOfMemoryError: GC overhead limit
>>>>>>>>>>>> exceeded
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> java.util.ArrayList.iterator(ArrayList.java:840)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >
>>>>>>>>>>>>  java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >
>>>>>>>>>>>>  java.net.InetAddress.getAllByName(InetAddress.java:1193)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >
>>>>>>>>>>>>  java.net.InetAddress.getAllByName(InetAddress.java:1127)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         java.net.InetAddress.getByName(InetAddress.java:1077)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >
>>>>>>>>>>>>  java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         org.postgresql.core.PGStream.<init>(PGStream.java:66)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.postgresql.core.QueryExecutorBase.sendQueryCancel(QueryExecutorBase.java:155)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >
>>>>>>>>>>>>  org.postgresql.jdbc.PgConnection.cancelQuery(PgConnection.java:971)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >
>>>>>>>>>>>>  org.postgresql.jdbc.PgStatement.cancel(PgStatement.java:812)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >
>>>>>>>>>>>>  org.postgresql.jdbc.PgStatement$1.run(PgStatement.java:880)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> java.util.TimerThread.mainLoop(Timer.java:555)
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> java.util.TimerThread.run(Timer.java:505)
>>>>>>>>>>>> >         >     > 2019-12-19 18:09:05,848 Job start thread
>>>>>>>>>>>> ERROR Unable to
>>>>>>>>>>>> >         write to
>>>>>>>>>>>> >         >     stream
>>>>>>>>>>>> >         >     > logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> >         >     > 2019-12-19 18:09:05,848 Seeding thread ERROR
>>>>>>>>>>>> Unable to
>>>>>>>>>>>> >         write to stream
>>>>>>>>>>>> >         >     > logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> >         >     > 2019-12-19 18:09:05,848 Job reset thread
>>>>>>>>>>>> ERROR Unable to
>>>>>>>>>>>> >         write to
>>>>>>>>>>>> >         >     stream
>>>>>>>>>>>> >         >     > logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> >         >     > 2019-12-19 18:09:05,848 Job notification
>>>>>>>>>>>> thread ERROR
>>>>>>>>>>>> >         Unable to
>>>>>>>>>>>> >         >     write to
>>>>>>>>>>>> >         >     > stream logs/manifoldcf.log for appender MyFile
>>>>>>>>>>>> >         >     > 2019-12-19 18:09:05,849 Seeding thread ERROR
>>>>>>>>>>>> An
>>>>>>>>>>>> >         exception occurred
>>>>>>>>>>>> >         >     > processing Appender MyFile
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >
>>>>>>>>>>>>  org.apache.logging.log4j.core.appender.AppenderLoggingException:
>>>>>>>>>>>> >         Error
>>>>>>>>>>>> >         >     > flushing stream logs/manifoldcf.log
>>>>>>>>>>>> >         >     >         at
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>   org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159).
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >     > _Also tried the approach to clean up Database
>>>>>>>>>>>> by
>>>>>>>>>>>> >         truncating all
>>>>>>>>>>>> >         >     > manifoldcf related tables, but still getting
>>>>>>>>>>>> this error._
>>>>>>>>>>>> >         >     >
>>>>>>>>>>>> >         >     > The parameters defined in the *postgresql.conf*
>>>>>>>>>>>> >         >     > file are as suggested, and
>>>>>>>>>>>> >         >     > "max_pred_locks_per_transaction" is set to "256".
>>>>>>>>>>>> >         >     > image.png
>>>>>>>>>>>> >         >
>>>>>>>>>>>> >
>>>>>>>>>>>>
>>>>>>>>>>>
