quetz-mod_python-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Graham Dumpleton <grah...@dscpl.com.au>
Subject Re: Apache 2.2 failure on Mac OS X 10.4.5.
Date Mon, 06 Mar 2006 11:40:33 GMT
Sorry, last for the night. Time for me to go to sleep.

If I modify "ab" source code in Apache 2.2 to output error number
causing it to fail, get:

   Send request failed 9!

The code where modification was done was:

         if (e != APR_SUCCESS) {
             /*
              * Let's hope this traps EWOULDBLOCK too !
              */
             if (!APR_STATUS_IS_EAGAIN(e)) {
                 epipe++;
                 printf("Send request failed %d!\n", e);
                 close_connection(c);
             }
             return;
         }

On Mac OS X this equates to:

   #define EBADF           9               /* Bad file descriptor */

Maybe when I am really bored I'll pursue further as to why.

Graham

On 06/03/2006, at 10:27 PM, Graham Dumpleton wrote:

> Don't even need to rewrite test to use threads to fire off  
> requests. If
> I hardwire test to use "ab" from Apache 1.3 or Apache 2.0, then it
> works okay. It is only the version of "ab" in Apache 2.0 which causes
> the test to fail.
>
> Graham
>
> On 06/03/2006, at 9:41 PM, Graham Dumpleton wrote:
>
>> A bit more information. This may actually be a problem with the  
>> "ab" tool
>> used for this test. I have actually seen the "ab" program give  
>> problems on
>> Mac OS X before even with Apache 2.0.
>>
>> On 06/03/2006, at 9:02 PM, Graham Dumpleton wrote:
>>
>>> Traceback (most recent call last):
>>>   File "test.py", line 2247, in test_global_lock
>>>     self.fail("global_lock is broken (too quick)")
>>>   File "/System/Library/Frameworks/Python.framework/Versions/2.3/ 
>>> lib/python2.3/unittest.py", line 270, in fail
>>>     raise self.failureException, msg
>>> AssertionError: global_lock is broken (too quick)
>>
>> What the test does is to run "ab" to fire off five concurrent  
>> requests. The
>> handler the request is calling acquires a global lock, sleeps for  
>> a second
>> and then releases the lock. Because all requests need to acquire the
>> lock but are blocked out for a second, the execution of the five  
>> requests
>> should therefore take five seconds because they will in effect be  
>> serialised.
>>
>> The problem is that "ab" is giving errors on 3 out of 5 of the  
>> requests:
>>
>> Concurrency Level:      5
>> Time taken for tests:   3.514548 seconds
>> Complete requests:      5
>> Failed requests:        0
>> Write errors:           3
>> Total transferred:      540 bytes
>> HTML transferred:       21 bytes
>> Requests per second:    1.42 [#/sec] (mean)
>> Time per request:       3514.548 [ms] (mean)
>> Time per request:       702.910 [ms] (mean, across all concurrent  
>> requests)
>> Transfer rate:          0.00 [Kbytes/sec] received
>>
>> As example, when "ab" is run in verbose mode:
>>
>> LOG: header received:
>> HTTP/1.1 200 OK^M
>> Date: Mon, 06 Mar 2006 10:28:42 GMT^M
>> Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 
>> 2.3.5^M
>> Connection: close^M
>> Content-Type: text/plain^M
>> ^M
>> test ok
>> LOG: Response code = 200
>> Send request failed!
>> LOG: header received:
>> HTTP/1.1 200 OK^M
>> Date: Mon, 06 Mar 2006 10:28:42 GMT^M
>> Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 
>> 2.3.5^M
>> Connection: close^M
>> Content-Type: text/plain^M
>> ^M
>>
>> Note the "Send request failed!" message.
>>
>> Although "ab" is giving the error, the requests are in fact  
>> received by Apache
>> and handled. This time output is from prefork and not worker.
>>
>> [Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640921.61 0
>> [Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640921.61 0
>> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640922.61 0
>> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640922.61 0
>> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640922.61 0
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.14 4
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640923.14 4
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.22 1
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.26 3
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.26 2
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.26 5
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640924.14 4
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640924.14 4
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640924.14 1
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640924.14 4
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640925.14 1
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640925.14 1
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640925.14 3
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640925.14 1
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640926.14 3
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640926.14 3
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640926.14 2
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640926.14 3
>> [Mon Mar 06 21:28:49 2006] [warn] child process 902 still did not  
>> exit, sending a SIGTERM
>> [Mon Mar 06 21:28:51 2006] [warn] child process 902 still did not  
>> exit, sending a SIGTERM
>> [Mon Mar 06 21:28:53 2006] [warn] child process 902 still did not  
>> exit, sending a SIGTERM
>> [Mon Mar 06 21:28:55 2006] [error] child process 902 still did not  
>> exit, sending a SIGKILL
>> [Mon Mar 06 21:28:56 2006] [info] removed PID file /Users/grahamd/ 
>> Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=898)
>> [Mon Mar 06 21:28:56 2006] [notice] caught SIGTERM, shutting down
>>
>> The request with connection.id of 2 (last value) never gets to  
>> complete because the
>> the test script is killing off Apache before it has a chance since  
>> "ab" prematurely
>> gave up on that request. Apache didn't shutdown properly because  
>> of the problem
>> with Py_Finalize() being called in signal handler and deadlocking  
>> because the
>> request handler still had the GIL.
>>
>>> -------------------------------------------------------------------- 
>>> --
>>> Ran 1 test in 8.658s
>>>
>>> FAILED (failures=1)
>>>
>>>
>>> The contents of the logs/error_log file are:
>>>
>>> [Mon Mar 06 20:57:31 2006] [notice] mod_python: Creating 6  
>>> session mutexes based on 2 max processes and 3 max threads.
>>> [Mon Mar 06 20:57:31 2006] [notice] Apache/2.2.0 (Unix)  
>>> mod_python/3.3.0-dev-20060305 Python/2.3.5 configured -- resuming  
>>> normal operations
>>> [Mon Mar 06 20:57:31 2006] [info] Server built: Feb 15 2006 21:39:56
>>> [Mon Mar 06 20:57:31 2006] [debug] worker.c(1715): AcceptMutex:  
>>> sysvsem (default: sysvsem)
>>> [Mon Mar 06 20:57:32 2006] [notice] mod_python: (Re)importing  
>>> module 'tests'
>>> [Mon Mar 06 20:57:33 2006] [notice] mod_python: (Re)importing  
>>> module 'tests'
>>> [Mon Mar 06 20:57:34 2006] [debug] worker.c(1329): taking over  
>>> scoreboard slot from 2500 (quiescing)
>>> [Mon Mar 06 20:57:34 2006] [notice] mod_python: (Re)importing  
>>> module 'tests'
>>> [Mon Mar 06 20:57:37 2006] [info] [client 127.0.0.1] (9)Bad file  
>>> descriptor: core_output_filter: writing data to the network
>>> [Mon Mar 06 20:57:38 2006] [info] [client 127.0.0.1] (9)Bad file  
>>> descriptor: core_output_filter: writing data to the network
>>> [Mon Mar 06 20:57:39 2006] [info] removed PID file /Users/grahamd/ 
>>> Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=2497)
>>> [Mon Mar 06 20:57:39 2006] [notice] caught SIGTERM, shutting down
>>
>> In the above earlier worker log, the failures in the core output  
>> filter is purely
>> because "ab" had killed the connection and thus the file  
>> descriptor on
>> server side had gone bad.
>>
>> Now as to whether "ab" is the problem. I know that I have seen  
>> with "ab"
>> before when doing load testing on Mac OS X that it will often  
>> exhibit some
>> rate of failures for no good reason even though the handlers in  
>> the server
>> look to work perfectly okay based on logging. I don't know the  
>> reason for
>> why "ab" does this, but reasonably confident that "ab" is probably  
>> the cause
>> of the failure and not Apache.
>>
>> What I might do is to use threading to make concurrent requests  
>> from the
>> test script itself and not involve "ab" at all and see how that goes.
>>
>> BTW, the log output at top was from modified test handler as below.
>>
>> def global_lock(req):
>>
>>     import _apache
>>
>>     req.log_error("global_lock 1 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     _apache._global_lock(req.server, 1)
>>
>>     req.log_error("global_lock 2 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     time.sleep(1)
>>
>>     req.log_error("global_lock 3 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     _apache._global_unlock(req.server, 1)
>>
>>     req.log_error("global_lock 4 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     req.write("test ok")
>>
>>     req.log_error("global_lock 5 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     return apache.OK
>>
>>
>> Graham
>>


Mime
View raw message