logging-log4j-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Duffin <pduf...@volantis.com>
Subject Re: FW: Slowness in LocationInfo construction
Date Fri, 23 Aug 2002 15:36:59 GMT
Mark Womack wrote:
> I am forwarding to the dev list so we will not forget about this.  Ceki is
> on vacation right now and I don't have time right this second.  Anyone else?
> 
> thanks,
> -Mark
> 
> -----Original Message-----
> From: Marc Prud'hommeaux [mailto:marc@solarmetric.com]
> Sent: Thursday, August 22, 2002 11:57 AM
> To: log4j-user@jakarta.apache.org
> Subject: Slowness in LocationInfo construction
> 
> 
> 
> I am using log4j-1.2.6.jar, and I am seeing that
> org.apache.log4j.spi.LoggingEvent.getLocationInformation is taking up
> 73% of my CPU time. This is very distressing, since the CategoryLogger
> is the default logger for the Apache Commons Logging project to use, so
> just calling org.apache.commons.logging.LogFactory.getLog ("foo").log
> ("bar")
> is going to show this performance hit. This is especially bad, since the
> logging output does not even wind up using information from the
> LocationInformation that takes so much time to create!
> 
> The CPU usage from OptimizeIt is as follows, after which I suggest a
> few performance improvements:
> 
>  Description of CPU usage for thread group main
>      73.58% - 427461.0 ms - java.lang.Throwable.printStackTrace()
>          73.58% - 427461.0 ms -
> org.apache.log4j.spi.LocationInfo.<init>()
>              73.58% - 427461.0 ms -
> org.apache.log4j.spi.LoggingEvent.getLocationInformation()
>                  73.58% - 427461.0 ms -
> org.apache.log4j.helpers.PatternParser$LocationPatternConverter.convert()
>                      73.58% - 427461.0 ms -
> org.apache.log4j.helpers.PatternConverter.format()
>                          73.58% - 427461.0 ms -
> org.apache.log4j.PatternLayout.format()
>                              73.58% - 427461.0 ms -
> org.apache.log4j.WriterAppender.subAppend()
>                                  73.58% - 427461.0 ms -
> org.apache.log4j.WriterAppender.append()
>                                      73.58% - 427461.0 ms -
> org.apache.log4j.AppenderSkeleton.doAppend()
>                                          73.58% - 427461.0 ms -
> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders()
>                                              73.58% - 427461.0 ms -
> org.apache.log4j.Category.callAppenders()
>                                                  73.58% - 427461.0 ms -
> org.apache.log4j.Category.forcedLog()
>                                                      73.58% - 427461.0
> ms - org.apache.log4j.Category.log()
>                                                          73.58% -
> 427461.0 ms - org.apache.commons.logging.impl.Log4JCategoryLog.info()
>                                                              73.58% -
> 427461.0 ms - com.solarmetric.datasource.DataSourceImpl.log()
> 
> 
> First off, in PatternParser$LocationPatternConverter, since the default
> switch statement falls through, it is a waste to be invoking
> getLocationInformation(), since it may not be used. E.g., the following
> code:
> 
>     public
>     String convert(LoggingEvent event) {
>       LocationInfo locationInfo = event.getLocationInformation();
>       switch(type) {
>       case FULL_LOCATION_CONVERTER:
>     return locationInfo.fullInfo;
>       case METHOD_LOCATION_CONVERTER:
>     return locationInfo.getMethodName();
>       case LINE_LOCATION_CONVERTER:
>     return locationInfo.getLineNumber();
>       case FILE_LOCATION_CONVERTER:
>     return locationInfo.getFileName();
>       default: return null;
>       }
>     }
>   }
> 
> ... could be changed to:
> 
>     public
>     String convert(LoggingEvent event) {
>       switch(type) {
>       case FULL_LOCATION_CONVERTER:
>     return event.getLocationInformation().fullInfo;
>       case METHOD_LOCATION_CONVERTER:
>     return event.getLocationInformation().getMethodName();
>       case LINE_LOCATION_CONVERTER:
>     return event.getLocationInformation().getLineNumber();
>       case FILE_LOCATION_CONVERTER:
>     return event.getLocationInformation().getFileName();
>       default: return null;
>       }
>     }
>   }
> 
> 
> 
> Secondly, LocationInfo should not dump the stack trace and do the
> initial parsing until it actually knows that the information will be
> needed. For some reason, the parsing is done in the constructor:
> 
> 
>     public LocationInfo(Throwable t, String fqnOfCallingClass) {
>       if(t == null)
>     return;
> 
>       String s;
>       // Protect against multiple access to sw.
>       synchronized(sw) {
>     t.printStackTrace(pw);
>     s = sw.toString();
>     sw.getBuffer().setLength(0);
>       }
> 
>      ... more parsing follows ...
> 

I modified the code to do this lazily only when asked. Unfortunately 
LocationInfo has a public visible field which is initialised in the 
constructor and this had to be made private and a getter method 
provided. This broke one of the other classes in log4j which referenced 
the field directly.

> 
> Thirdly, I don't understand why the StringWriter sw is static in this
> class: the overhead involed in synchronizing access to it outweights the
> memory hit that would be taken by just using a member variable.
> 

As it is currently written the array managed by the StringBuffer which 
underlies the StringWriter and the String which owns it become garbage 
after the parsing of the stack but the StringWriter, the StringBuffer 
and the PrintWriter can be reused. However, you may be correct that the 
synchronization cost is larger than the cost required to create and 
garbage collect those objects but it would be very platform dependent.

A much more efficient way which does not involve either synchronizing, 
or which generates a lot of garbage is for the caller which is 
requesting the location information to pass in an object which contains 
the necessary objects to parse the stack. Some callers may not be able 
to reuse the objects any more efficiently then now but some such as the 
AsyncAppender which are only running in one thread can reuse the object 
with almost no cost.

The other side to this is that you may need to write some custom classes 
which replace the StringWriter / StringBuffer combination, or some code 
which can search through the StringBuffer directly without converting to 
a String. Converting to a String means that the character array 
contained within the StringBuffer is owned by the String and cannot be 
reused and will therefore become garbage. Stack traces can be really 
quite long so you could be generating up to 1-2K of garbage for every 
log message which will have a serious impact on performance when the 
garbage collector runs.

> 
> Finally, since it is often the case that many log messages will be
> coming from the same line of code, it may make sense to keep a parse
> cache (perhaps in a SoftHashMap) of [stack trace string : parsed
> information], so that rapid multiple invocations of a logging message
> will not require the stack trace to be reparsed over and over.
> 

The problem with this is that while many log messages will be coming 
from the same line of code the stack trace below that may be different 
for each of the log messages. Your proposal means that you will have to 
maintain in memory the set of string representations of all possible 
paths through the code (which if it the code contains recursion is 
theoretically infinite). This is a huge amount of data for little 
benefit. If you analyse the parseThrowable method you will almost 
certainly see that the majority of the time is spent in the creating of 
the stack trace string not the parsing of it so your cache will not save 
any time as it will still need to generate the stack trace string.

Improving the speed of printing the stack trace will be much more 
beneficial. It also may be faster and more memory efficient for the 
object to which the stack trace is printed to perform the parsing rather 
than waiting for the whole stack to be printed and then parsing it. It 
all depends on whether the code which prints the stack trace does it in 
small bits such as lines or parts of lines, or constructs the whole lot 
and prints it in one go. Assuming that it does not construct the whole 
lot in one go then it may be faster to use a custom PrintWriter which 
does the following.

1) Accumulate the characters being written until a whole line has been
    retrieved.
2) Check to see whether the line contains the fully qualified name of
    the calling class, if it does not then go back to step 1, else if it
    does ignore this line and go to step 3.
3) Get the next line, remove any rubbish, save it away and go to step 4.
4) Ignore all other calls to the print methods.

After getting the Throwable to print itself to the custom PrintWriter 
the caller can simply query it to retrieve the name of the caller.


--
To unsubscribe, e-mail:   <mailto:log4j-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:log4j-dev-help@jakarta.apache.org>


Mime
View raw message