commons-dev mailing list archives

From Gilles Sadowski
Subject Re: [Math] FastMath preset tables
Date Wed, 14 Sep 2011 21:02:45 GMT

> >
> >People taking part in this discussion[1] seem to have a hard time being
> >explicit about what they are trying to achieve.
> >
> >(1)
> >From information gathered so far, the issue raised seems to have been solved
> >by taking advantage of the fact that the JVM loads classes at first use (i.e.
> >methods will not be delayed by the loading of tables that they don't use).
> >
> >At least, this is what I conclude from my tests that compare code with and
> >without preset tables (which differ by less than 50 ms). [This has yet to be
> >confirmed by the initial poster who reported an unexpected difference of 30
> >microseconds!]
> According to Alexis' post from today, the loading time is rather 5ms
> with one setting and 182ms with the reverse setting, so there is
> a large factor (36 times).

The factor is large, but it does not really matter because you
multiply it by 1 (one): the gain is one-shot. That's why I think that it is
relevant to ask whether the application at hand will be restarted several
times per second.

> The initial low times are explained by his benchmark which simply
> computed class load time. With the initial situation, this was fair
> as it involved computing the tables, but with the current code it
> was not representative anymore since it did not even load the
> tables, as they are loaded on demand and on a per table basis.

That's what I thought (cf. the JIRA page).
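
For readers who want to see the on-demand loading mentioned above, here is a minimal sketch of the holder idiom it relies on. The names "LazyTables", "ExpTableHolder" and the "loads" counter are hypothetical and only serve the illustration; this is not the actual "FastMath" code.

```java
public class LazyTables {
    /** Counts table constructions; present only so the laziness can be observed. */
    public static int loads = 0;

    /** Holder class: the JVM initializes it only when TABLE is first accessed. */
    private static class ExpTableHolder {
        static final double[] TABLE = build();

        private static double[] build() {
            loads++;
            double[] t = new double[1024];
            for (int i = 0; i < t.length; i++) {
                t[i] = Math.exp(i / 1024.0);
            }
            return t;
        }
    }

    /** Does not touch the holder, so this call never builds the table. */
    public static double abs(double x) {
        return x < 0 ? -x : x;
    }

    /** The first call triggers the initialization of ExpTableHolder. */
    public static double expFrac(int i) {
        return ExpTableHolder.TABLE[i];
    }

    public static void main(String[] args) {
        abs(-1.0);
        System.out.println("loads after abs: " + loads);
        expFrac(512);
        System.out.println("loads after expFrac: " + loads);
    }
}
```

A benchmark that only loads the outer class (or only calls "abs") never pays for the table, which is why timing the class load alone stops being representative.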

> >
> >(2)
> >A _second_ issue has been bundled in the commits related to the initial
> >problem described above: Instead of computing the tables contents at
> >runtime, they are now set from literal arrays.
> >
> >In addition to being non-consensual, it is still not clear that this change
> >is a necessary step to fix the reported problem.
> I don't understand your point. Tables can be computed at runtime or
> at compile time. Literal arrays are simply the easiest and most
> portable way to have compile-time arrays. Other options would
> involve really tricky steps that would be difficult to get right
> with different build systems. The build systems used currently are
> ant, maven2 and eclipse at least, and these systems are also used by
> Gump and Continuum. I think other people use other IDE too.
> When we discussed this initially with Sebb, we ruled out such complex
> settings, and decided to go with literal arrays generated once.

Has this discussion taken place here? I must have missed it.
I don't see what complex settings you had considered.
I agree that if tables must be used, they should be generated once, and that
the literal arrays are the simplest.

However, the first and main point is that I could not understand what was
considered to be a "too large" initialization time (cf. my last comment
on the JIRA page).
Reasonable as it was to fix a minute-long startup, I did not think it
reasonable to grab for an additional tenth of a second.

> Is there another option we missed ?

[cf. below.]

> >
> >(3)
> >On a PC, comparing the old "FastMath" code (no IOD, no preset) with the
> >latest version, I get the following timing gain for a single call to
> >"pow" (i.e. a function that _uses_ the tables):
> >   130 ms (preset)
> >    80 ms (no preset)
> >So, indeed, using preset tables does make the first call run faster.
> >[On subsequent calls, the difference is less than 1 microsecond (cf.
> >"FastMathLoadCheck").]
> >
> >The issue is: When do we say that initialization time is too long?
> I think I am lost here. What do you call preset and no preset ?

preset = precomputed tables (aka literal arrays)
no preset = tables computed at runtime
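
A minimal sketch of how the first call can be timed separately from later calls. It uses "java.lang.Math.pow" as a stand-in so that it runs without Commons Math on the classpath (an assumption; substitute "FastMath::pow" to measure the real thing). The class name "FirstCallTiming" is hypothetical, and a single-shot figure from "System.nanoTime" is noisy, so treat the output as indicative only.

```java
import java.util.function.DoubleBinaryOperator;

public class FirstCallTiming {
    /** Times one invocation of f, in nanoseconds. */
    public static long timeOnce(DoubleBinaryOperator f, double x, double y) {
        long start = System.nanoTime();
        double r = f.applyAsDouble(x, y);
        long elapsed = System.nanoTime() - start;
        // Use the result so that the call cannot be optimized away.
        if (Double.isNaN(r)) {
            throw new IllegalStateException("unexpected NaN");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Stand-in function; swap in FastMath::pow when Commons Math is available.
        DoubleBinaryOperator pow = Math::pow;

        // The first call includes any class loading and table setup.
        long first = timeOnce(pow, 1.5, 2.5);

        // Later calls only pay for the computation itself.
        int n = 1000;
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += timeOnce(pow, 1.5 + i * 1e-3, 2.5);
        }

        System.out.println("first call: " + first + " ns");
        System.out.println("mean of next " + n + " calls: " + (sum / n) + " ns");
    }
}
```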

> >
> >On this machine:
> >   Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
> >the difference is around 50 ms. Is that too long?
> >This will most probably be swamped in the execution time of any useful
> >application and in my opinion does not justify the workaround currently in
> >trunk.
> >
> >The slowness reported initially (9 seconds to ~1 minute on a "low-end"
> >device) is indeed excessive.
> >But can we please draw the line at some meaningful value instead of
> >prematurely over-optimizing for a one-shot gain?
> I agree with you. From direct experience with the Android
> application, I experienced a loading delay slightly below one
> minute. I asked Alexis to do some benchmark and he reported most of
> the time was due to FastMath, then I asked him to open a Jira
> issue.
> I'm not sure anymore which benchmarks are flawed and which
> benchmarks are representative. Unfortunately, due to some low level
> kernel issue, I cannot do any benchmark by myself on my tablet
> except user level timing (I can't connect my tablet to my computer
> and run it in a monitored mode). So I am waiting for a new version
> of the complete application for such user-level timings, and cannot
> do anything about sub-second precise timings. I am sorry for that.

I own neither a tablet nor a smartphone, and so was not able to perform
the same check as I did on my machine (with the "PerfTestUtils" class in
the "test" area of the repository). Hence I could only wonder about what was
being timed in the report by Alexis and how to relate it to what I was
observing here. [OTOH, the results from Sebb's benchmark were perfectly

Can you use "PerfTestUtils" on your devices? This is like using CM, so I
don't think that any kernel issue should prevent it.
I'll post the java file I used to benchmark calls to "pow" on the JIRA page.

> >
> >(4)
> >Can we also lay out rules about what constitutes an acceptable request for a
> >workaround?
> >
> >I don't think that it is OK to just say that "FastMath" is too slow. The master
> >argument here was often that one should provide a (realistic) use case.
> >
> >I see that a faster startup time would benefit an application required to
> >be restarted several times per second. But how realistic would that be?
> This occurs in web services. This is a kind of application we get
> more and more often. I don't know at all how the server handles
> upcoming requests, and in particular if classes are reloaded or not,
> reoptimized or not. I know for sure the JVM is not restarted from
> scratch.

As Ted already pointed out, it would really be impossible to run web services if
the JVM would be restarted for each request, or would even reload the
The "long" initialization of "FastMath" will be done once. And I bet that
the few hundred milliseconds we talk about are completely offset by the
initialization of the rest of the web server machinery.

> Another kind of application we have is small user computation
> (things akin to a pocket calculator). The android application we
> speak about belongs to this category. It is a space flight dynamics
> calculator that performs simple conversions (orbit conversions,
> frames conversions, time conversions, visibility detection,
> spacecraft impulse maneuvers). There is no high-frequency
> repetition, but there are human factors. Typically, if you
> had to wait more than one second to get the result of a
> multiplication on your pocket calculator, you would be upset.


> Here,
> we have to wait 57 seconds.

Then 100 ms plus or minus won't matter.

> The last benchmarks seem to imply
> FastMath was not the only culprit, despite what was initially
> identified. It is however part of the problem and for the web
> services case I think it is really worth improving its loading time.

As said above, I don't think that preset tables will make any noticeable
difference for a web service. The more so if the computation is inherently
complex and the "incompressible" request time is already beyond a second.

> >And would "FastMath" be the single bottleneck in such a case?
> >Moreover, if there was such need to be able to restart the JVM several times
> >per second, then I'd draw the attention to the fact that "FastMath" is not
> >the right tool: Indeed, for the first call to "pow", it is still about 150
> >slower than "Math" or "StrictMath". Does that suggest that we must implement
> >some way so that users are able to select whether CM will use "Math" or
> >"FastMath"?
> >
> >(5)
> >On Sun, Sep 11, 2011 at 02:51:31PM +0100, sebb wrote:
> >>[...]
> >>
> >>I don't think minimising the class source file size is nearly as
> >>important as the startup time.
> >>
> >
> >First, it's not only about source size, but also code versus tables.
> >The former is self-descriptive.
> Yes, but we don't expect anybody to read the whole table. Reading
> only some comments above it pointing to the code that was used to
> generate them is sufficient.

Yes, maybe; I'm just pointing out that such arguments as I present are at
least as important as a 100 ms gain at startup, gain that dwindles as time
passes and computers become faster.

> >
> >Second, not only source file is larger, but so is bytecode size.
> >Without the preset tables, the ".class" file was 38229 bytes long.
> >With all the changes to accommodate preset tables, there are now 5 ".class"
> >files with the following sizes:
> >   8172  FastMathCalc.class
> >  34671  FastMath.class
> >  35252  FastMath$ExpFracTable.class
> >  49944  FastMath$ExpIntTable.class
> >  39328  FastMath$lnMant.class
> >
> >For the same functionality, this results in more than a four-fold increase
> >in bytecode size.[2]
> Yes, so what ?
> Many performance problems end up with a trade-off between memory and
> execution time. As memory is cheap the current trend is to go to
> large tables in many places. Even inside processors, or any decent
> mathematical functions libraries, there are tables. One of the
> problems with such functions is even named the "table maker's dilemma"
> (the term was coined by Kahan if I remember well, who is well known
> for all his work on floating point arithmetic and IEEE standard).
> I would gladly accept tables up to a few megabytes.

You cannot, as I've pointed out early on in this discussion. A Java source
file, and Java code constructs have several limitations on size (64K).
That's why I proposed to move those tables to separate source files with the
advantage that it won't pollute a file that is manually edited. The table
files could be generated by a script.

> For now I would
> be worried by tables larger than several tens of megabytes. However,
> I am convinced that in 3 to 5 years from now I would say otherwise
> and would start saying that megabyte tables are small and worries
> start at gigabytes. Now we have 3 tables and the largest is 50
> kilobytes, this is small and does almost fit within many processor
> caches, which are currently of the order of magnitude of 35kbytes
> for 5 years old processors (I don't have a more recent processor to
> check newer values).

If going in that direction (not discussing whether this is good or bad),
I would say that we should surely not use literal arrays but look at those
tables as "resources" and load them with the appropriate functionality.
This would then most clearly set them apart from the "real" code.
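
A minimal sketch of the resource approach, assuming a simple binary layout (an "int" count followed by that many "double" values). The class name "TableResource" and the layout are hypothetical. The demonstration reads from an in-memory stream so that it is self-contained; in a library the stream would typically come from "Class.getResourceAsStream".

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class TableResource {
    /** Encodes a table as: int length, then each value as a big-endian double. */
    public static byte[] encode(double[] table) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(table.length);
            for (double v : table) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads back a table written by encode(); closes the stream when done. */
    public static double[] load(InputStream in) {
        try (DataInputStream data = new DataInputStream(new BufferedInputStream(in))) {
            double[] table = new double[data.readInt()];
            for (int i = 0; i < table.length; i++) {
                table[i] = data.readDouble();
            }
            return table;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] original = {1.0, Math.E, Math.PI};
        // In-memory stand-in for a resource file shipped in the jar.
        double[] loaded = load(new ByteArrayInputStream(encode(original)));
        System.out.println("entries loaded: " + loaded.length);
    }
}
```

Unlike literal arrays, a resource of this kind is not subject to the per-method bytecode limit, at the cost of an I/O step at first use.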

