directory-api mailing list archives

From Emmanuel Lécharny <>
Subject Re: Prepare String
Date Tue, 05 Apr 2016 23:05:33 GMT
So for the record, after a couple of hours working on it tonight, I got
the DeepTrimToLowerNormalizer() working fine, with tests passing.

I was also able to improve the performance of the beast: from 20
seconds to normalize 10,000,000 Strings like "xs crvtbynU 
Jikl7897790", down to 4.3s. I just assumed that most of the time, we
will deal with chars between 0x00 and 0x7F, and wrote a specific
function for that. If we have chars above 0x7F, then an exception is
thrown and we fall back to the complex process, which will then take
47s instead of 20s.
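
To make the idea concrete, here is a minimal sketch of the fast-path / fallback pattern described above. It is not the actual DeepTrimToLowerNormalizer code: the class name NormalizeSketch, the NonAsciiException type and the method names are hypothetical, only the space character is treated as whitespace in the fast path, and the slow path uses java.text.Normalizer as a stand-in for the complete process.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical sketch: names and the slow-path logic are assumptions,
// not the real Apache Directory implementation.
final class NormalizeSketch {

    /** Thrown by the fast path as soon as a char above 0x7F is seen. */
    static final class NonAsciiException extends Exception {
    }

    /** Fast path: deep-trims and lowercases, valid only for chars in [0x00, 0x7F]. */
    static String fastAscii(String s) throws NonAsciiException {
        StringBuilder out = new StringBuilder(s.length());
        boolean pendingSpace = false;

        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);

            if (c > 0x7F) {
                // Give up: the caller falls back to the complete process.
                throw new NonAsciiException();
            }

            if (c == ' ') {
                // Leading spaces are dropped; inner runs collapse to one space.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');
                    pendingSpace = false;
                }
                // ASCII-only lowercasing, no locale or Unicode tables needed.
                out.append(c >= 'A' && c <= 'Z' ? (char) (c + 32) : c);
            }
        }

        // A trailing pending space is never appended, so the end is trimmed too.
        return out.toString();
    }

    /** Slow path (stand-in): handles any char, at a higher cost. */
    static String slowFull(String s) {
        String n = Normalizer.normalize(s, Normalizer.Form.NFKC);
        return n.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    /** Tries the ASCII fast path first, then falls back to the slow path. */
    static String normalize(String s) {
        try {
            return fastAscii(s);
        } catch (NonAsciiException e) {
            return slowFull(s);
        }
    }
}
```

With this sketch, NormalizeSketch.normalize("  xs crvtbynU   Jikl7897790 ") stays on the fast path, while a value such as " \u00C9mile  ZOLA " pays for both the aborted fast attempt and the slow path, which is where the extra cost for non-ASCII values comes from.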

So this is a balance :
- we have an implementation that covers all the chars, and takes 20s for
10M Strings
- we have an implementation that tries to process the String if chars
are in [0x00, 0x7F] and takes 4.3s for 10M Strings, but takes 47
seconds if we have a char outside this range.
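
As a rough guide to this balance, assume (this is an assumption, not something measured) that the cost scales linearly with the fraction p of Strings containing a char above 0x7F. The expected time for 10M Strings with the two-step implementation, and the break-even point against the flat 20s, would then be:

```latex
T(p) = 4.3\,(1 - p) + 47\,p \quad \text{(seconds per 10M Strings)}

T(p^{*}) = 20
\;\Longrightarrow\;
p^{*} = \frac{20 - 4.3}{47 - 4.3} = \frac{15.7}{42.7} \approx 0.37
```

Under that assumption, the two-step implementation is faster on average as long as fewer than about 37% of the values contain non-ASCII chars.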

Besides the obvious gain, there is another reason why I wanted to do
that: processing IA5String values will benefit from this separation, and
that covers numerous AttributeTypes (like mail, homeDirectory,
memberUid, krb5principalname, krb5Realmname, and a lot more).

wdyt ? Going for an average of 20s no matter what, or accepting a huge
penalty when the String does not contain ASCII chars ?
