hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")
Date Thu, 19 Aug 2010 16:35:19 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Doug Cutting updated HADOOP-6884:

    Attachment: FunAgain.java

You've benchmarked a case that isn't in the existing patch (3 variables) and certainly isn't
typical.  Your 'static' case still boxes.

So your benchmark mostly shows that boxing costs, and string concatenation costs even more
yet.  I don't see how it shows that any of these costs are significant in the Hadoop codebase.

I've attached a version that benchmarks the more typical single-parameter using slf4j and
a static version that avoids boxing.

The output this gives is:

java.version = 1.6.0_20
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_20-b02
java.vm.version = 16.3-b01
java.vm.vendor = Sun Microsystems Inc.
java.vm.name = Java HotSpot(TM) Server VM
java.vm.specification.version = 1.0
java.specification.version = 1.6
os.arch = i386
os.name = Linux
os.version = 2.6.32-24-generic-pae

LOG.isDebugEnabled(): 13 ms
static debug(..)    : 78 ms
LOG.debug(..)       : 472 ms

What does this show?  It shows that simply switching to slf4j using format strings would remove
string concatenation costs speeding log statements by around 10x without any bloat.  It does
not show whether that improvement would be significant in any larger Hadoop benchmark, but
at least the cost in code readability would be null.  Further it shows that, if we find that
logging costs are significant somewhere due to boxing, we could optimize that with static
methods, gaining another 5x without losing any readability.  Finally, if we find that logging
costs are still somewhere significant, we can improve them yet another 6x with some impairment
to readability.  This is a bit like manual loop unrolling.  We'll do it in certain performance-critical
areas when it's shown to provide a significant advantage, but we shouldn't do it blindly for
every loop.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some
cases it's expensive to construct the string that is being printed to log. It's much easier
to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case
reason whether it's necessary or not).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message