ambari-dev mailing list archives

From Jayush Luniya <jlun...@hortonworks.com>
Subject Re: Server unit tests take too long (30+ minutes)
Date Wed, 25 Mar 2015 03:45:08 GMT
Done. 

https://issues.apache.org/jira/browse/AMBARI-10197

Thanks
Jayush


On 3/24/15, 7:50 PM, "Jonathan Hurley" <jhurley@hortonworks.com> wrote:

>Ah, I see that. Looks like TestController.TestController is a common
>theme here then. I tried running the tests on CentOS 6 instead of OSX and
>it looks like mine hung on test_certSigningFailed the first time and
>test_heartbeat_no_host_check_cmd_in_queue the second time.
>
>Let’s open up a Jira for this so it can be tracked and resolved.
>
>> On Mar 24, 2015, at 7:20 PM, Jayush Luniya <jluniya@hortonworks.com> wrote:
>> 
>> Hi Jonathan,
>> Yes, as I mentioned, the unit tests hang, but the hang is not 100%
>> reproducible. The BOA is aborted after 2 hours.
>> 
>> However, the builds always hang during the Ambari Agent tests. If you look
>> further up in the logs, you will see that the actual abort happened during
>> the TestController UTs (i.e. Python was terminated), but the build was not
>> yet entirely terminated, so it continued building the Ambari client and
>> Python client until it was completely aborted.
>> 
>> test_addToStatusQueue (TestController.TestController) ... ok
>> test_certSigningFailed (TestController.TestController) ... ok
>> test_heartbeatWithServer (TestController.TestController) ... ok
>> test_registerAndHeartbeat (TestController.TestController) ... ok
>> test_registerAndHeartbeatWithException (TestController.TestController) ... ok
>> test_registerAndHeartbeat_check_registration_listener (TestController.TestController) ... Build timed out (after 120 minutes). Marking the build as aborted.
>> Build was aborted
>> /home/jenkins/jenkins-slave/workspace/Ambari-trunk-Commit/ambari-agent/../ambari-common/src/main/unix/ambari-python-wrap: line 40: 31955 Terminated           $PYTHON "$@"
>> [INFO]          
>> 
>> [INFO] ------------------------------------------------------------------------
>> [INFO] Building Ambari Client 2.0.0-SNAPSHOT
>> [INFO] ------------------------------------------------------------------------
>> [INFO] 
>> [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ ambari-client ---
>> [INFO] Deleting /home/jenkins/jenkins-slave/workspace/Ambari-trunk-Commit/ambari-client (includes = [**/*.pyc], excludes = [])
>> [INFO] 
>> [INFO] --- build-helper-maven-plugin:1.8:regex-property (parse-package-version) @ ambari-client ---
>> [INFO] 
>> [INFO] --- build-helper-maven-plugin:1.8:regex-property (parse-package-release) @ ambari-client ---
>> [INFO] 
>> [INFO] --- apache-rat-plugin:0.11:check (default) @ ambari-client ---
>> [INFO] 53 implicit excludes (use -debug for more details).
>> [INFO] No excludes explicitly specified.
>> [INFO] 2 resources included (use -debug for more details)
>> [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 approved: 2 licence.
>> [INFO] 
>> [INFO] --- maven-assembly-plugin:2.2-beta-5:single (build-tarball) @ ambari-client ---
>> [INFO] Reading assembly descriptor: assemblies/client.xml
>> [INFO] 
>> [INFO] --- maven-assembly-plugin:2.2-beta-5:single (make-assembly) @ ambari-client ---
>> [INFO] Reading assembly descriptor: assemblies/client.xml
>> [INFO] 
>> [INFO] --- maven-install-plugin:2.4:install (default-install) @ ambari-client ---
>> [INFO] Installing /home/jenkins/jenkins-slave/workspace/Ambari-trunk-Commit/ambari-client/pom.xml to /home/jenkins/.m2/repository/org/apache/ambari/ambari-client/2.0.0-SNAPSHOT/ambari-client-2.0.0-SNAPSHOT.pom
>> [INFO] 
>> 
>> [INFO] ------------------------------------------------------------------------
>> [INFO] Building Ambari Python Client 2.0.0-SNAPSHOT
>> [INFO] ------------------------------------------------------------------------
>> [INFO] 
>> [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ python-client ---
>> [INFO] Deleting /home/jenkins/jenkins-slave/workspace/Ambari-trunk-Commit/ambari-client/python-client (includes = [**/*.pyc], excludes = [])
>> [INFO] 
>> [INFO] --- build-helper-maven-plugin:1.8:regex-property (parse-package-version) @ python-client ---
>> [INFO] 
>> [INFO] --- build-helper-maven-plugin:1.8:regex-property (parse-package-release) @ python-client ---
>> [INFO] 
>> [INFO] --- exec-maven-plugin:1.2:exec (python-test) @ python-client ---
>> Updating AMBARI-10163
>> Recording test results
>> Warning: you have no plugins providing access control for builds, so falling back to legacy behavior of permitting any downstream builds to be triggered
>> Finished: ABORTED
>> 
>> Thanks
>> Jayush
>> 
>> On 3/24/15, 1:25 PM, "Jonathan Hurley" <jhurley@hortonworks.com> wrote:
>> 
>>> I think that we're looking in the wrong places. Consider:
>>> 
>>> https://builds.apache.org/job/Ambari-trunk-Commit/2101
>>> and
>>> https://builds.apache.org/job/Ambari-trunk-Commit/2100
>>> 
>>> 2101 successfully built in about an hour. 2100 did not; it aborted after
>>> 2 hours. It aborted during the Groovy unit tests. Ambari unit test time
>>> variances should not swing the total job time by an hour.
>>> 
>>> Perhaps something else is going on here. Maybe there's a network issue
>>> and Git or one of the Maven build steps is taking too long.
>>> 
>>> The pattern seems to be that the builds are not stuck at one point, since
>>> they are aborted at different stages from job to job: Groovy, agent tests, etc.
>>> 
>>> 
>>> On Mar 24, 2015, at 4:07 PM, Jonathan Hurley
>>> <jhurley@hortonworks.com<mailto:jhurley@hortonworks.com>> wrote:
>>> 
>>> No, that change should have no effect on the tests. There were aborted
>>> runs before that change, and there were failed runs after it. It seems
>>> like in some cases, the tests just take too long.
>>> 
>>> On Mar 24, 2015, at 3:55 PM, Jayush Luniya
>>> <jluniya@hortonworks.com<mailto:jluniya@hortonworks.com>> wrote:
>>> 
>>> This is the change that went into build #2072.
>>> 
>>> Jonathan, any chance the issue below could have been caused by it?
>>> Sumit, what was the commit version of your change to re-enable the
>>> TestController tests, and when was it committed?
>>> 
>>> 
>>> 1. AMBARI-10126 <https://issues.apache.org/jira/browse/AMBARI-10126> -
>>> Alert Scheduler Is Double Scheduling Jobs (jonathanhurley) (details
>>> <https://builds.apache.org/job/Ambari-trunk-Commit/2072/changes#detail0>)
>>> 
>>> Commit 68468feeeeb35ca9edd4899ea8b1abafb7c2742a
>>> <http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=68468feeeeb35ca9edd4899ea8b1abafb7c2742a> by jhurley
>>> <https://builds.apache.org/user/jhurley/> AMBARI-10126
>>> <https://issues.apache.org/jira/browse/AMBARI-10126> - Alert Scheduler Is
>>> Double Scheduling Jobs (jonathanhurley)
>>> 
>>> ambari-agent/src/main/python/ambari_agent/Controller.py
>>> <http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=blob&f=ambari-agent/src/main/python/ambari_agent/Controller.py&h=bb85337bfdf2404a6aabf78eb361c112f77c977e&hb=68468feeeeb35ca9edd4899ea8b1abafb7c2742a> (diff)
>>> <http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=blobdiff&f=ambari-agent/src/main/python/ambari_agent/Controller.py&fp=ambari-agent/src/main/python/ambari_agent/Controller.py&h=eeca4c294399e04dae8d893f078d6e6125f3df47&hp=bb85337bfdf2404a6aabf78eb361c112f77c977e&hb=68468feeeeb35ca9edd4899ea8b1abafb7c2742a&hpb=32e1215639f3cdfea68e2955f316576f1ded85fe>
>>> 
>>> 
>>> Thanks
>>> Jayush
>>> 
>>> On 3/24/15, 12:49 PM, "Sumit Mohanty"
>>> <smohanty@hortonworks.com<mailto:smohanty@hortonworks.com>> wrote:
>>> 
>>> The TestController tests are the ones I recently re-enabled to run on Mac,
>>> so we may see these failures locally as well if your dev box is a Mac.
>>> ________________________________________
>>> From: Jayush Luniya
>>> <jluniya@hortonworks.com<mailto:jluniya@hortonworks.com>>
>>> Sent: Tuesday, March 24, 2015 12:24 PM
>>> To: Alejandro Fernandez;
>>> dev@ambari.apache.org<mailto:dev@ambari.apache.org>
>>> Subject: Re: Server unit tests take too long (30+ minutes)
>>> 
>>> Agreed, we should take a look at reducing our test times.
>>> 
>>> Also, I looked at the latest builds on trunk, and it looks like the agent
>>> tests are hanging as well, leading to builds being aborted. The culprit
>>> seems to be the TestController tests. This is not a consistent failure, but
>>> it has happened very frequently since build #2072:
>>> https://builds.apache.org/job/Ambari-trunk-Commit/
>>> 
>>> 
>>> test_repeatRegistration (TestController.TestController) ... ok
>>> test_restartAgent (TestController.TestController) ... ok
>>> test_run (TestController.TestController) ... Build timed out (after 120 minutes). Marking the build as aborted.
>>> Build was aborted
>>> /home/jenkins/jenkins-slave/workspace/Ambari-trunk-Commit/ambari-agent/../ambari-common/src/main/unix/ambari-python-wrap: line 40: 20024 Terminated         $PYTHON "$@"
>>> 
>>> Thanks
>>> Jayush
>>> 
>>> From: Alejandro Fernandez
>>> <afernandez@hortonworks.com<mailto:afernandez@hortonworks.com>>
>>> Date: Tuesday, March 24, 2015 at 12:18 PM
>>> To: "dev@ambari.apache.org<mailto:dev@ambari.apache.org>"
>>> <dev@ambari.apache.org<mailto:dev@ambari.apache.org>>
>>> Cc: Jayush Luniya
>>> <jluniya@hortonworks.com<mailto:jluniya@hortonworks.com>>
>>> Subject: Re: Server unit tests take too long (30+ minutes)
>>> 
>>> +1 to that.
>>> 
>>> grep -B1 ".*sec$" ~/test_times.txt | sed 's/^.*Time elapsed: \(.*\)$/\1/'
>>> 
>>> Here's another run with all tests that took over 30 secs. Total time in
>>> these 28 test classes was 28 mins.
>>> The biggest culprit was AmbariManagementControllerTest at 5:17 (317.327 sec).
>>> 
>>> Running org.apache.ambari.server.agent.TestHeartbeatHandler
>>> 89.435 sec
>>> 
>>> Running org.apache.ambari.server.upgrade.UpgradeTest
>>> 76.566 sec
>>> 
>>> Running org.apache.ambari.server.security.authorization.AmbariLdapAuthenticationProviderForDNWithSpaceTest
>>> 55.582 sec
>>> 
>>> Running org.apache.ambari.server.security.authorization.TestUsers
>>> 43.228 sec
>>> 
>>> Running org.apache.ambari.server.security.authorization.AmbariLdapAuthenticationProviderTest
>>> 57.922 sec
>>> 
>>> Running org.apache.ambari.server.controller.internal.StackDefinedPropertyProviderTest
>>> 56.585 sec
>>> 
>>> Running org.apache.ambari.server.controller.internal.RepositoryVersionResourceProviderTest
>>> 60.788 sec
>>> 
>>> Running org.apache.ambari.server.controller.internal.UpgradeResourceProviderTest
>>> 40.329 sec
>>> 
>>> Running org.apache.ambari.server.controller.internal.HostStackVersionResourceProviderTest
>>> 34.812 sec
>>> 
>>> Running org.apache.ambari.server.controller.internal.StageResourceProviderTest
>>> 37.434 sec
>>> 
>>> Running org.apache.ambari.server.controller.AmbariServerTest
>>> 37.638 sec
>>> 
>>> Running org.apache.ambari.server.controller.AmbariManagementControllerTest
>>> 317.327 sec
>>> 
>>> Running org.apache.ambari.server.actionmanager.TestActionDBAccessorImpl
>>> 53.404 sec
>>> 
>>> Running org.apache.ambari.server.scheduler.ExecutionScheduleManagerTest
>>> 34.245 sec
>>> 
>>> Running org.apache.ambari.server.notifications.dispatchers.SNMPDispatcherTest
>>> 34.732 sec
>>> 
>>> Running org.apache.ambari.server.state.UpgradeHelperTest
>>> 35.616 sec
>>> 
>>> Running org.apache.ambari.server.state.alerts.AlertEventPublisherTest
>>> 62.627 sec
>>> 
>>> Running org.apache.ambari.server.state.alerts.AlertDefinitionHashTest
>>> 42.206 sec
>>> 
>>> Running org.apache.ambari.server.state.alerts.AlertStateChangedEventTest
>>> 41.462 sec
>>> 
>>> Running org.apache.ambari.server.state.stack.UpgradePackTest
>>> 72.379 sec
>>> 
>>> Running org.apache.ambari.server.state.ConfigHelperTest
>>> 72.849 sec
>>> 
>>> Running org.apache.ambari.server.state.svccomphost.ServiceComponentHostTest
>>> 50.383 sec
>>> 
>>> Running org.apache.ambari.server.state.cluster.ClusterTest
>>> 69.889 sec
>>> 
>>> Running org.apache.ambari.server.state.cluster.ClusterDeadlockTest
>>> 80.271 sec
>>> 
>>> Running org.apache.ambari.server.state.ServiceTest
>>> 45.443 sec
>>> 
>>> Running org.apache.ambari.server.orm.dao.AlertsDAOTest
>>> 57.077 sec
>>> 
>>> Running org.apache.ambari.server.orm.dao.AlertDefinitionDAOTest
>>> 33.872 sec
>>> 
>>> Running org.apache.ambari.server.metadata.RoleCommandOrderTest
>>> 31.794 sec
>>> 
>>> Thanks,
>>> Alejandro
>>> 
>>> On 3/24/15, 11:54 AM, "Jonathan Hurley"
>>> <jhurley@hortonworks.com<mailto:jhurley@hortonworks.com>> wrote:
>>> 
>>> Many of these, such as the deadlock tests and alert tests, are just going
>>> to take a long time due to the nature of what they're doing. In general,
>>> if b.a.o is timing out, we need to either increase the timeout for the
>>> job or change our pom.xml to allow for forked execution of the tests.
>>> 
>>> In my local environment, 3 concurrent forks can run through the test
>>> suite in about 20 minutes. The problem is that both LDAP tests below
>>> always fail in a forked environment. I'd say if we want to get the build
>>> times down, we should look into making the 2 LDAP tests work with forked
>>> test runners in the pom.xml.
>>> 
>>> On Mar 24, 2015, at 2:33 PM, Sumit Mohanty
>>> <smohanty@hortonworks.com<mailto:smohanty@hortonworks.com>> wrote:
>>> Hi,
>>> these are some of the unit tests that take too long (more than 30 seconds
>>> on my machine). There are several in the 10 to 30 second range that could
>>> also use some optimization.
>>> Jayush tells me that the Apache builds may be getting aborted as the
>>> build + UT run takes more than an hour.
>>> I will look into some of it when I get a chance. If there are any that
>>> pique your curiosity, then take a look.
>>> Running org.apache.ambari.server.agent.TestHeartbeatHandler
>>> Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.43 sec
>>> Running org.apache.ambari.server.state.cluster.ClusterTest
>>> Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.576 sec
>>> Running org.apache.ambari.server.state.cluster.ClusterDeadlockTest
>>> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 52.252 sec
>>> Running org.apache.ambari.server.upgrade.UpgradeTest
>>> Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.433 sec
>>> Running org.apache.ambari.server.orm.dao.AlertDispatchDAOTest
>>> Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.681 sec
>>> Running org.apache.ambari.server.orm.dao.AlertsDAOTest
>>> Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.474 sec
>>> Running org.apache.ambari.server.security.authorization.TestUsers
>>> Tests run: 26, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 36.421 sec
>>> Running org.apache.ambari.server.security.authorization.AmbariLdapAuthenticationProviderTest
>>> Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.46 sec
>>> Running org.apache.ambari.server.security.authorization.AmbariLdapAuthenticationProviderForDNWithSpaceTest
>>> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.706 sec
>>> Running org.apache.ambari.server.state.ConfigHelperTest
>>> Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 31.863 sec
>>> Running org.apache.ambari.server.controller.internal.StackDefinedPropertyProviderTest
>>> Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 31.247 sec
>>> ...
>>> thanks
>>> -Sumit
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>
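
For anyone who wants to repeat the slow-test survey quoted above, here is a minimal sketch of the same idea as the grep/sed one-liner, written in Python. It assumes the input is raw test console output in the "Running <class>" / "Tests run: ..., Time elapsed: N sec" form shown in the thread. The function name slow_tests and the threshold_sec parameter are made up for illustration and are not part of Ambari or Maven.

```python
import re

# Patterns for the two console line shapes quoted in this thread, e.g.
#   Running org.apache.ambari.server.agent.TestHeartbeatHandler
#   Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.43 sec
RUNNING = re.compile(r"^Running\s+(\S+)\s*$")
ELAPSED = re.compile(r"Time elapsed:\s*([0-9.]+)\s*sec")


def slow_tests(lines, threshold_sec=30.0):
    """Return (class_name, seconds) pairs above threshold_sec, slowest first.

    Hypothetical helper: pairs each "Running" line with the next
    "Time elapsed" line that follows it.
    """
    results = []
    current = None
    for line in lines:
        m = RUNNING.match(line.strip())
        if m:
            current = m.group(1)
            continue
        m = ELAPSED.search(line)
        if m and current is not None:
            seconds = float(m.group(1))
            if seconds > threshold_sec:
                results.append((current, seconds))
            current = None
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Sample input copied from the thread; a real run would read the saved log.
sample = [
    "Running org.apache.ambari.server.state.cluster.ClusterTest",
    "Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 55.576 sec",
    "Running org.apache.ambari.server.agent.TestHeartbeatHandler",
    "Tests run: 34, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 67.43 sec",
]

for name, seconds in slow_tests(sample):
    print("%8.3f sec  %s" % (seconds, name))
```

To run it against a saved log such as the ~/test_times.txt file mentioned above, pass the open file object to slow_tests in place of the sample list. Sorting slowest first makes outliers such as AmbariManagementControllerTest easy to spot.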
