directory-api mailing list archives

From Stefan Seelmann
Subject Re: Prepare String
Date Wed, 06 Apr 2016 06:47:10 GMT
On 04/06/2016 01:05 AM, Emmanuel Lécharny wrote:
> So for the record, after a couple of hours working on it tonight, I got
> the DeepTrimToLowerNormalizer() working fine, with tests passing.
> I was also able to improve the performance of the beast: from 20
> seconds to normalize 10 000 000 Strings like "xs crvtbynU 
> Jikl7897790", down to 4.3s. I just assumed that most of the time, we
> will deal with chars between 0x00 and 0x7F, and wrote a specific
> function for that. If we have chars above 0x7F, then an exception is
> thrown and we fall back to the complex process, which will then take
> 47s instead of 20s.
> So this is a balance:
> - we have an implementation that covers all the chars, and takes 20s for
> 10M Strings
> - we have an implementation that tries to process the String if chars
> are in [0x00, 0x7F] and takes 4.3s for 10M Strings, but takes 47
> seconds if we have a char outside this range.
> Besides the obvious gain, there is another reason why I wanted to do
> that: processing IA5String values will benefit from this separation, and
> that covers numerous AttributeTypes (like mail, homeDirectory,
> memberUid, krb5principalname, krb5Realmname, and a lot more).
> wdyt ? Going for an average of 20s no matter what, or accepting a huge
> penalty when the String does not contain ASCII chars ?

I'd go for the 2nd optimized way.

Is the cause of the penalty only the exception-throw-catch? Then maybe
it's worth testing whether it improves when not throwing an exception but
returning a special flag (like null) and checking for that?
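
To make the idea concrete, here is a rough sketch of what I mean, not the
actual DeepTrimToLowerNormalizer code. All names (AsciiFastPathSketch,
normalizeAscii, normalizeFull) are made up for illustration, and the slow
path is only a simplified stand-in (NFKC, collapse spaces, lowercase) for
the real string preparation. The fast path returns null as soon as it sees
a char above 0x7F, and the caller checks for that instead of catching an
exception:

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical sketch: names and the simplified slow path are assumptions,
// not the Apache Directory API implementation.
final class AsciiFastPathSketch {

    /** Fast path: returns null (the "special flag") on the first char above 0x7F. */
    static String normalizeAscii(String value) {
        StringBuilder out = new StringBuilder(value.length());
        boolean pendingSpace = false;

        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);

            if (c > 0x7F) {
                return null; // not pure ASCII: let the caller fall back
            }

            if (c == ' ') {
                // Keep at most one space, and none before the first kept char
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                // ASCII-only lowercasing
                out.append(c >= 'A' && c <= 'Z' ? (char) (c + 32) : c);
            }
        }
        // A trailing run of spaces is never appended, so the result is trimmed
        return out.toString();
    }

    /** Slow path stand-in: handles any char, at a higher cost. */
    static String normalizeFull(String value) {
        String nfkc = Normalizer.normalize(value, Normalizer.Form.NFKC);
        return nfkc.replaceAll(" +", " ")
                   .replaceAll("^ | $", "")
                   .toLowerCase(Locale.ROOT);
    }

    /** Try the fast path first; fall back only when it signals non-ASCII input. */
    static String normalize(String value) {
        String fast = normalizeAscii(value);
        return fast != null ? fast : normalizeFull(value);
    }
}
```

A non-ASCII value still pays for the aborted fast pass plus the full pass,
but without the cost of creating and unwinding an exception. Whether that
accounts for most of the gap between 47s and 20s would have to be measured.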

Kind Regards,
