atlas-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ATLAS-3168) PatchFx: Support for HA Mode
Date Thu, 25 Apr 2019 05:54:00 GMT

    [ https://issues.apache.org/jira/browse/ATLAS-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16825753#comment-16825753 ]

ASF subversion and git services commented on ATLAS-3168:
--------------------------------------------------------

Commit 982123e46f88b6777ba07bc9d6b21068f9495863 in atlas's branch refs/heads/master from Ashutosh Mestry
[ https://gitbox.apache.org/repos/asf?p=atlas.git;h=982123e ]

ATLAS-3168: PatchFx: Unit test fixes and optimization.


> PatchFx: Support for HA Mode
> ----------------------------
>
>                 Key: ATLAS-3168
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3168
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>    Affects Versions: 2.0.0, trunk
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>            Priority: Major
>             Fix For: 2.0.0, trunk
>
>         Attachments: ATLAS-3168-PatchFx-Fix-for-Startup-in-HA-mode.patch, ATLAS-3168-PatchFx-Unit-test-fixes-and-optimization.patch
>
>
> *Description*
> PatchFx in HA mode causes exceptions.
> *Steps to Duplicate*
> Deploy the latest version of Atlas on a cluster in HA mode.
> The following error appears during startup:
> {code:java}
> 2019-04-23 03:54:22,280 ERROR - [main-EventThread:] ~ Got exception while activating (ActiveInstanceElectorService:160)
> java.lang.NullPointerException
>         at org.apache.atlas.repository.audit.HBaseBasedAuditRepository.createTableIfNotExists(HBaseBasedAuditRepository.java:521)
>         at org.apache.atlas.repository.audit.HBaseBasedAuditRepository.instanceIsActive(HBaseBasedAuditRepository.java:627)
>         at org.apache.atlas.web.service.ActiveInstanceElectorService.isLeader(ActiveInstanceElectorService.java:154)
>         at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:665)
>         at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:661)
>         at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93)
>         at org.apache.curator.shaded.com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:435)
>         at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:85)
>         at org.apache.curator.framework.recipes.leader.LeaderLatch.setLeadership(LeaderLatch.java:660)
>         at org.apache.curator.framework.recipes.leader.LeaderLatch.checkLeadership(LeaderLatch.java:539)
>         at org.apache.curator.framework.recipes.leader.LeaderLatch.access$700(LeaderLatch.java:65)
>         at org.apache.curator.framework.recipes.leader.LeaderLatch$7.processResult(LeaderLatch.java:590)
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:865)
>         at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:635)
>         at org.apache.curator.framework.imps.WatcherRemovalFacade.processBackgroundOperation(WatcherRemovalFacade.java:152)
>         at org.apache.curator.framework.imps.GetChildrenBuilderImpl$2.processResult(GetChildrenBuilderImpl.java:187)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:602)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> 2019-04-23 03:54:22,280 WARN  - [main-EventThread:] ~ Server instance with server id id2 is removed as leader (ActiveInstanceElectorService:197)
> {code}
> *Root Cause*
> Pattern followed within Atlas:
>  * _Service.start_ is called when _Services_ is initialized.
>  * For every service:
>  ** If Atlas is not in HA mode: start and perform startup-specific actions.
>  ** If Atlas is in HA mode: start and wait for _instanceIsActive_ to be called.
> _AtlasPatchService_ did not follow this pattern:
>  * It did not implement _ActiveStateChangeHandler_.
>  * It was not registered with _ActiveStateChangeHandler.HandlerOrder_.
> This caused _AtlasPatchService.start_ to patch the database immediately, before _AtlasTypeDefStoreInitializer_ had been initialized, which led to exceptions. _ActiveInstanceElectorService_ then received a callback from ZK and called the _instanceIsActive_ method on _HBaseBasedAuditRepository_, which had not been started. That call threw the _NullPointerException_ shown in the stack trace above.
> *Solution*
> Modify _AtlasPatchService_ to follow the pattern used for other services.
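
The pattern described under Root Cause can be sketched roughly as follows. This is a minimal illustration, not the committed patch: the two interfaces are simplified, hypothetical stand-ins for the Atlas _Service_ and _ActiveStateChangeHandler_ types (the real signatures may differ), and _PatchServiceSketch_, its _haEnabled_ flag, and its event list are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the Atlas interfaces named in the
// description; the real Atlas signatures may differ.
interface Service {
    void start();
}

interface ActiveStateChangeHandler {
    void instanceIsActive();
    void instanceIsPassive();
}

// Sketch of a patch service that follows the startup pattern:
// patch in start() only when HA is disabled, otherwise wait until
// instanceIsActive() is called on the elected instance.
class PatchServiceSketch implements Service, ActiveStateChangeHandler {
    private final boolean haEnabled;          // assumption: HA flag is known at construction
    private final List<String> events = new ArrayList<>();
    private boolean patchesApplied = false;

    PatchServiceSketch(boolean haEnabled) {
        this.haEnabled = haEnabled;
    }

    @Override
    public void start() {
        if (haEnabled) {
            // In HA mode, do nothing yet; the instance may never become active.
            events.add("deferred");
            return;
        }
        applyPatches();
    }

    @Override
    public void instanceIsActive() {
        applyPatches();
    }

    @Override
    public void instanceIsPassive() {
        events.add("passive");
    }

    // Guarded so that repeated activation does not patch twice.
    private void applyPatches() {
        if (!patchesApplied) {
            patchesApplied = true;
            events.add("patched");
        }
    }

    List<String> getEvents() {
        return events;
    }
}
```

With this shape, a non-HA instance patches during _start_, while an HA instance only records that patching was deferred and patches once _instanceIsActive_ is invoked. Ordering relative to other handlers (the _HandlerOrder_ registration mentioned above) is not modeled here.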



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
