manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject RE: Zookeeper in Apache ManifoldCF
Date Wed, 02 Jul 2014 21:49:52 GMT
Hi Lalit,

Each agents process in a cluster needs its own ID. Please look carefully at
the multiprocess ZooKeeper example for details on how to do that.  If you
didn't intend for there to be multiple agents processes in one cluster, you
did something wrong, because that is what you have.
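
As a rough illustration of the "one ID per agents process" rule, here is a
small Python sketch that reads the process ID out of each process's
properties.xml and reports IDs that are empty or shared. The property name
org.apache.manifoldcf.processid and the <property name="..." value="..."/>
layout are assumptions recalled from the ManifoldCF deployment docs, not
something stated in this thread, and the helper names are hypothetical. An ID
passed on the JVM command line instead would not be seen by this check.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed property name; verify it against your own properties.xml.
PROCESS_ID_PROPERTY = "org.apache.manifoldcf.processid"


def read_process_id(properties_xml):
    """Return the process ID found in a properties.xml document, or '' if none is set."""
    root = ET.fromstring(properties_xml)
    for prop in root.iter("property"):
        if prop.get("name") == PROCESS_ID_PROPERTY:
            return prop.get("value", "")
    return ""


def find_conflicts(process_ids):
    """Return the IDs that are empty or used by more than one process."""
    counts = Counter(process_ids)
    return sorted(pid for pid, n in counts.items() if pid == "" or n > 1)
```

Feeding it the contents of every properties.xml in the cluster should give an
empty list when each process has a distinct, non-empty ID.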

Karl

Sent from my Windows Phone
------------------------------
From: lalit jangra
Sent: 7/2/2014 2:11 PM
To: Karl Wright
Cc: user@manifoldcf.apache.org
Subject: Re: Zookeeper in Apache ManifoldCF

Hello,

I have configured three ZooKeeper instances on ports 2181, 2182, and 2183 on my
server, and in mcf/dist/multiprocess-zk-example I have configured all three
servers as a comma-separated list.

Now I have started all three ZooKeeper instances and I can see all three
running. Next I tried a crawl job, but in manifoldcf.log I can see the
error below.

ERROR 2014-07-02 19:07:15,716 (Agents thread) - Exception tossed: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active

org.apache.manifoldcf.core.interfaces.ManifoldCFException: Service '' of type 'AGENT_org.apache.manifoldcf.crawler.system.CrawlerAgent' is already active
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:156)
        at org.apache.manifoldcf.core.lockmanager.BaseLockManager.registerServiceBeginServiceActivity(BaseLockManager.java:120)
        at org.apache.manifoldcf.core.lockmanager.LockManager.registerServiceBeginServiceActivity(LockManager.java:69)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.checkAgents(AgentsDaemon.java:270)
        at org.apache.manifoldcf.agents.system.AgentsDaemon$AgentsThread.run(AgentsDaemon.java:208)


How can I check whether these errors are related to ZooKeeper?
Also, how can I tell whether MCF is integrated with ZooKeeper?
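
One way to rule out the ZooKeeper servers themselves is to send each one the
"ruok" four-letter command, to which a healthy server replies "imok". The
Python sketch below does that for a comma-separated server list. The
localhost host name in the usage part is an assumption (the ports are the
ones mentioned above), the function names are hypothetical, and newer
ZooKeeper releases may need "ruok" to be explicitly enabled. A reply only
shows that the server is alive; it does not show that MCF is actually using
it.

```python
import socket


def parse_connect_string(connect):
    """Split 'host1:2181,host2:2182' into [(host, port), ...]."""
    servers = []
    for entry in connect.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        servers.append((host, int(port)))
    return servers


def is_ok(host, port, timeout=2.0):
    """Send 'ruok' to one server and report whether it answered 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            return sock.recv(16) == b"imok"
    except OSError:
        # Connection refused, timeout, or other socket failure.
        return False


if __name__ == "__main__":
    # Assumed connect string: three instances on one host.
    for host, port in parse_connect_string("localhost:2181,localhost:2182,localhost:2183"):
        print(f"{host}:{port}", "imok" if is_ok(host, port) else "no reply")
```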


Regards.



On Tue, Jul 1, 2014 at 3:19 PM, Karl Wright <daddywri@gmail.com> wrote:

> Hi Lalit,
>
> I presumed in my recommendation that your "active" and "passive"
> manifoldcf instances were using the same PostgreSQL server, but were using
> different database instances within it.  That is the only way it could
> reasonably work.
>
> Any time you have a Zookeeper cluster, they recommend you have three
> instances.  Effectively you are setting up two ManifoldCF clusters: an
> "active" one, and a "passive" one.  Each one has its own database instance
> within PostgreSQL, and each one (if it is multiprocess) should have 3
> zookeeper instances.
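
The three-instance recommendation follows from ZooKeeper needing a majority
of the ensemble to be up. A minimal Python sketch of that arithmetic (the
function names are hypothetical) shows why two instances tolerate no
failures while three tolerate one:

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 5, 6):
    print(n, "servers: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

By the same arithmetic, six instances tolerate no more failures than five.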
>
> I hope this is clear.
>
> Karl
>
>
>
> On Tue, Jul 1, 2014 at 9:54 AM, lalit jangra <lalit.j.jangra@gmail.com>
> wrote:
>
>> Thanks Karl,
>>
>> I have a slight variation here: both MCF nodes run in Active/Active mode
>> pointing to the same DB. Is ZooKeeper still required?
>>
>> Also, by "two sets of three zookeeper machines", do you mean I need to
>> set up three ZooKeeper instances on each node, for a total of six ZooKeeper
>> nodes working across both machines in the same ensemble?
>>
>> Regards.
>>
>>
>> On Mon, Jun 30, 2014 at 6:50 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> Hi Lalit,
>>>
>>> You can keep things really simple by having both active and passive mcf
>>> instances run each as a single process, either under jetty or using the
>>> combined war under tomcat.  If that is not acceptable, you would need two
>>> sets of three zookeeper machines, one set for each instance.
>>>
>>> Karl
>>>
>>> Sent from my Windows Phone
>>> ------------------------------
>>> From: lalit jangra
>>> Sent: 6/30/2014 12:19 PM
>>> To: user@manifoldcf.apache.org
>>> Subject: Re: Zookeeper in Apache ManifoldCF
>>>
>>>  Thanks Karl & Graeme,
>>>
>>> Let me elaborate my scenario and what i am trying to achieve.
>>>
>>> I have two servers, each running MCF 1.5.1 individually. Both of them
>>> are backed by the same PostgreSQL DB, so both MCF applications point
>>> to the same DB at any point in time, without having their own dedicated DBs.
>>> In addition, the primary/active DB instance is backed up periodically
>>> from the active to the passive instance.
>>>
>>> Only one DB instance will be active at any time, with the other DB instance
>>> acting as an active standby. If the primary/active instance fails, the
>>> passive/secondary instance takes over and becomes the primary/active
>>> instance handling all DB transactions, and the former primary becomes the
>>> new secondary DB instance.
>>>
>>> Similarly i have two solr 4.6 instances which act in active/passive mode
>>> with periodic backup of active/primary to passive/secondary with active
>>> standby and failover.
>>>
>>> So my intention in clustering is high availability of the system with
>>> failover, but I will not use both MCF instances in parallel or
>>> simultaneously.
>>>
>>> Finally, I am limited to two instances only, but as mentioned
>>> earlier, we need at least three ZooKeeper instances for a proper ZooKeeper
>>> cluster.
>>>
>>> Is it still worthwhile to use ZooKeeper, or can I do simple clustering
>>> where each MCF node is clustered using the same DB? Please suggest.
>>>
>>> Thanks for help.
>>>
>>> Regards.
>>>
>>>
>>> On Fri, Jun 27, 2014 at 11:15 AM, Graeme Seaton <lists@graemes.com>
>>> wrote:
>>>
>>>>  Hi Lalit,
>>>>
>>>> For production use, you will want to spin up your own ZK cluster using
>>>> the instructions on the zookeeper site (as pointed out earlier at least 3
>>>> is recommended)....
>>>>
>>>> You then need to modify the properties.xml file in
>>>> multiprocess-zk-example to point to the list of Zookeeper servers.  You
>>>> also need to modify properties-global.xml with the appropriate global
>>>> settings i.e. logging levels, Postgresql database etc. and then run
>>>> setglobalproperties.sh to register the settings in ZK.
>>>>
>>>> To test that it is working, set up a crawl and then tail the
>>>> manifoldcf.log file on each of your nodes to check that they are all
>>>> crawling in parallel.
>>>>
>>>> HTH,
>>>>
>>>> Graeme
>>>>
>>>>
>>>> On 25/06/14 12:19, Karl Wright wrote:
>>>>
>>>>  Hi Lalit,
>>>>
>>>> Zookeeper does not use a database; it keeps its stuff in the local file
>>>> system.  Each Zookeeper node has its own local data, and everything else
>>>> is socket communication between them.
>>>>
>>>>  As for information: http://zookeeper.apache.org/
>>>>
>>>>  Karl
>>>>
>>>>
>>>>
>>>> On Wed, Jun 25, 2014 at 6:56 AM, lalit jangra <lalit.j.jangra@gmail.com>
>>>> wrote:
>>>>
>>>>>  Thanks Karl,
>>>>>
>>>>> Apologies, as I am not very familiar with ZooKeeper and am still trying
>>>>> to figure it out.
>>>>>
>>>>> Is there any more documentation, or are there pointers available? That
>>>>> would be very helpful.
>>>>>
>>>>>  Also, I have 2 Tomcat servers in a cluster, each with MCF 1.5.1 set up
>>>>> and configured to point to the same PostgreSQL DB, and the DB is backed up
>>>>> for failover. From your inputs, it seems that we need to configure a
>>>>> separate standalone ZooKeeper server which will act as master, and both
>>>>> nodes in the cluster will need to work as slaves and talk to the
>>>>> standalone ZooKeeper master.
>>>>>
>>>>>  Also, will the ZooKeeper server have its own DB, so that we can either
>>>>> host it separately or use the same Postgres DB?
>>>>>
>>>>>  Regards.
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 25, 2014 at 11:33 AM, Karl Wright <daddywri@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>   Hi Lalit,
>>>>>>
>>>>>>  1. Zookeeper is already spun into MCF.  In fact you start a
>>>>>> zookeeper instance when you run the mcf zookeeper example.  They
>>>>>> recommend, though, that for failover you have 3 instances, etc.
>>>>>>  2. Looks like the documentation is out of date and something old is
>>>>>> left in there.
>>>>>>  3. Zookeeper is a client/server kind of arrangement.  You need at
>>>>>> least ONE zookeeper server, and each cluster member includes a zookeeper
>>>>>> client, which is configured to talk with ALL the zookeeper server
>>>>>> instances you have.
>>>>>>  4. There is ONE database instance; the instance may be supported by
>>>>>> failover and redundant Postgresql, but it appears as one instance.
>>>>>> To get failover from Postgres you need the Enterprise Edition, which
>>>>>> costs money.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 25, 2014 at 4:47 AM, lalit jangra <
>>>>>> lalit.j.jangra@gmail.com> wrote:
>>>>>>
>>>>>>>     Thanks Karl,
>>>>>>>
>>>>>>>  That was helpful.
>>>>>>>
>>>>>>>  I am setting up a clustered deployment on Tomcat, following the
>>>>>>> instructions at
>>>>>>> http://manifoldcf.apache.org/release/trunk/en_US/how-to-build-and-deploy.html#Simplified+multi-process+model+using+ZooKeeper-based+synchronization
>>>>>>> and I need some suggestions here.
>>>>>>>
>>>>>>>  1. Do we need to download ZooKeeper and put it in the
>>>>>>> multiprocess-zk-example folder, or is it already spun into MCF
>>>>>>> and we are good to go?
>>>>>>>  2. It says all jars under *processes* should be put into the classpath,
>>>>>>> but I cannot see any *processes* folder under MCF.
>>>>>>>  3. Do we need to set up ZooKeeper on both nodes or only on one
>>>>>>> node? I assume we need to do it on both nodes.
>>>>>>>  4. Do we also need to set up databases separately on both nodes?
>>>>>>> Also, can we set up the ZooKeeper DB using the same PostgreSQL, or
>>>>>>> will it use its own HSQL DB?
>>>>>>>
>>>>>>>  Finally, how can I test that my ZooKeeper is set up and ready to
>>>>>>> roll?
>>>>>>>
>>>>>>>  Thanks for your help.
>>>>>>>
>>>>>>> Regards.
>>>>>>>
>>>>>>>
>>>>>>>  On Tue, Jun 24, 2014 at 1:56 PM, Karl Wright <daddywri@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>  Hi Lalit,
>>>>>>>>  ZooKeeper is standard for cluster deployments these days.  See the
>>>>>>>> multiprocess-zookeeper example for ideas about how to deploy it.  It's
>>>>>>>> also important to read the how-to-build-and-deploy page to understand
>>>>>>>> the example.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Karl
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 24, 2014 at 8:04 AM, lalit jangra <
>>>>>>>> lalit.j.jangra@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>  Hi,
>>>>>>>>>
>>>>>>>>>  I am planning to use MCF in cluster mode. I want to
>>>>>>>>> know whether ZooKeeper is of any help here.
>>>>>>>>>
>>>>>>>>>  If yes, how can it be leveraged with distributed MCF servers?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Lalit Jangra.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  --
>>>>>>> Regards,
>>>>>>> Lalit Jangra.
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>  --
>>>>> Regards,
>>>>> Lalit Jangra.
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards,
>>> Lalit Jangra.
>>>
>>
>>
>>
>> --
>> Regards,
>> Lalit Jangra.
>>
>
>


-- 
Regards,
Lalit Jangra.
