directory-api mailing list archives

From Emmanuel Lécharny <elecha...@gmail.com>
Subject Re: Filter/ExprNode refactoring
Date Thu, 21 Apr 2016 08:21:20 GMT
Le 18/04/16 12:12, Emmanuel Lécharny a écrit :
> Hi guys,
>
> this weekend, I spent some time reviewing the Filter/ExprNode classes.
> They need some refactoring, wrt the Value refactoring and many other
> aspects that are currently causing problems, especially the escaping of
> some special chars.
>
> There are many issues :
> - the ExprNode is the result of a filter parsing, and it works well when
> we are schema aware, not so well when we aren't. On the server side, we
> will have to normalize the values according to the String Preparation
> for each attribute we know of. On the client side, it's irrelevant.
> - as we convert the filter to a complex ASN.1 structure when sending it
> to an LDAP server, at some point we convert the Strings to
> byte[], with all the escaped chars being transformed to their hex
> counterpart
> - we should offer a way for the user not to bother about the chars
> that are going to be escaped. Typically, creating a SimpleNode( "cn",
> "You are a *" ); should escape the '*' automatically.
>
> All those changes - and some others - are not trivial, and I must admit
> I have to think a bit more about the impact on the existing API.
>
> I'll try to work on that during the next few evenings...
Hi,

here is where I'm atm :
- all tests are passing green up to Apache Directory API Integration Tests.

Now, I'm facing some issues in the integ tests :

org.junit.ComparisonFailure: expected:<2.5.4.11=[some people,0.9.2342.19200300.100.1.25=example,0.9.2342.19200300.100.1.25=com]>
but was:<2.5.4.11=[\ some  people\ ,0.9.2342.19200300.100.1.25=\ example\ ,0.9.2342.19200300.100.1.25=\ com\ ]>
    at org.junit.Assert.assertEquals(Assert.java:115)
    at org.junit.Assert.assertEquals(Assert.java:144)
    at org.apache.directory.api.ldap.model.name.DnTest.testLdapNameToName(DnTest.java:1976)

for a test :

    @Test
    public void testLdapNameToName() throws Exception
    {
        Dn name = new Dn( "ou= Some   People   ", "dc = eXample", "dc= cOm" );

        assertTrue( name.getName().equals( "ou= Some   People   ,dc = eXample,dc= cOm" ) );

        Dn result = name.apply( schemaManager );

        assertEquals( "2.5.4.11=some people,0.9.2342.19200300.100.1.25=example,0.9.2342.19200300.100.1.25=com",
            result.getNormName() );
    }


Here, the problem is that the Dn.getNormName() method will return a
normalized version of the DN, which will internally call all the
normalizers of each of the RDN. That will typically transform the " cOm"
string value to the prepared string " com ", and the "Some   People  "
string to " some  people ".
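
To make the transformation concrete, here is a minimal sketch of what such a normalizer roughly does to a value: case folding plus the insignificant space handling of RFC 4518, followed by escaping of the leading and trailing space as required in a DN string. The class PrepSketch and its two methods are hypothetical helpers written for this illustration, not part of the API, and the real normalizers do more (mapping, Unicode normalization, prohibited characters).

```java
import java.util.Locale;

// Hypothetical helper for illustration only, not part of the LDAP API.
final class PrepSketch
{
    /** Case-folds the value and applies the insignificant space rule. */
    static String prepare( String value )
    {
        // Drop leading and trailing spaces (U+0020 only)
        String core = value.replaceAll( "^ +| +$", "" );

        if ( core.isEmpty() )
        {
            // An empty or all-space value is mapped to two spaces in this sketch
            return "  ";
        }

        // Lower-case, then replace every inner run of spaces by exactly two spaces
        core = core.toLowerCase( Locale.ROOT ).replaceAll( " +", "  " );

        // The prepared value starts and ends with exactly one space
        return " " + core + " ";
    }


    /** Escapes a leading and a trailing space with a backslash (spaces only). */
    static String escapeEnds( String value )
    {
        StringBuilder sb = new StringBuilder( value );

        // Handle the trailing space first so the index is not shifted
        if ( sb.length() > 1 && sb.charAt( sb.length() - 1 ) == ' ' )
        {
            sb.insert( sb.length() - 1, '\\' );
        }

        if ( sb.length() > 0 && sb.charAt( 0 ) == ' ' )
        {
            sb.insert( 0, '\\' );
        }

        return sb.toString();
    }


    public static void main( String[] args )
    {
        // Values taken from the test above
        System.out.println( "[" + escapeEnds( prepare( " Some   People   " ) ) + "]" );
        System.out.println( "[" + escapeEnds( prepare( " cOm" ) ) + "]" );
    }
}
```

With the values of the test, the prepared and escaped forms carry the '\ ' at both ends that shows up in the failure message.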

At this point, I'm questioning the rationale for having a
Dn/Rdn/Ava.getNormName(). We do have a compareTo() and a compareValue()
method if we need to compare two Dns, two Rdns or two Avas.

The whole idea in the server was to allow quick comparisons
of full DNs (like ADMIN_SYSTEM_DN_NORMALIZED =
"0.9.2342.19200300.100.1.1=admin,2.5.4.11=system" or
CN_SCHEMA_DN_NORMALIZED = "2.5.4.3=schema"). Comparing two strings will
clearly be faster than comparing two DNs.

Now, how much faster ? That has to be measured. It also comes at a cost :
every time we parse a DN, we have to construct this normalized name for
each of its Avas, each of its Rdns and for the DN itself. This is a lot of
work. And when it comes to storing the DN on disk, we serialize the
normalized name, which is also costly, and takes space on disk (twice,
because we store the RDN in the RDN index and in the entry itself). We
also have to deserialize it when reading it back from disk.

So here are my thoughts :
- we have to check how much slower it is to compare 2 DNs vs comparing 2
strings
- we have to measure the cost of creating a DN normName, the cost of
serializing it, and the cost of deserializing it.
- once done, we will be able to decide what to do : drop
getNormName() or keep it
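
As a starting point for the first measurement, something along these lines could be used. It is only a rough sketch: CompareCostSketch and Pair are hypothetical stand-ins (not the real Dn/Rdn/Ava classes), the timing loop is naive, and a harness such as JMH would give more trustworthy numbers. It only illustrates the two things being compared : equals() on a precomputed normalized string vs a pair-by-pair comparison.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types come from the LDAP API.
final class CompareCostSketch
{
    /** Stand-in for one normalized attribute/value pair of a DN. */
    static final class Pair
    {
        final String oid;
        final String value;

        Pair( String oid, String value )
        {
            this.oid = oid;
            this.value = value;
        }
    }


    /** Builds the "oid=value,oid=value" form, as a precomputed normalized name would be. */
    static String normName( List<Pair> dn )
    {
        StringBuilder sb = new StringBuilder();

        for ( Pair pair : dn )
        {
            if ( sb.length() > 0 )
            {
                sb.append( ',' );
            }

            sb.append( pair.oid ).append( '=' ).append( pair.value );
        }

        return sb.toString();
    }


    /** Compares two DNs pair by pair, without any precomputed string. */
    static boolean sameDn( List<Pair> a, List<Pair> b )
    {
        if ( a.size() != b.size() )
        {
            return false;
        }

        for ( int i = 0; i < a.size(); i++ )
        {
            if ( !a.get( i ).oid.equals( b.get( i ).oid ) || !a.get( i ).value.equals( b.get( i ).value ) )
            {
                return false;
            }
        }

        return true;
    }


    public static void main( String[] args )
    {
        List<Pair> left = Arrays.asList( new Pair( "0.9.2342.19200300.100.1.1", "admin" ),
            new Pair( "2.5.4.11", "system" ) );
        List<Pair> right = Arrays.asList( new Pair( "0.9.2342.19200300.100.1.1", "admin" ),
            new Pair( "2.5.4.11", "system" ) );
        String leftName = normName( left );
        String rightName = normName( right );
        int loops = 5_000_000;
        int hits = 0; // accumulated so the loops are not optimized away

        long t0 = System.nanoTime();

        for ( int i = 0; i < loops; i++ )
        {
            if ( leftName.equals( rightName ) )
            {
                hits++;
            }
        }

        long t1 = System.nanoTime();

        for ( int i = 0; i < loops; i++ )
        {
            if ( sameDn( left, right ) )
            {
                hits++;
            }
        }

        long t2 = System.nanoTime();

        System.out.println( "String equals : " + ( t1 - t0 ) / 1_000_000 + " ms" );
        System.out.println( "Pair by pair  : " + ( t2 - t1 ) / 1_000_000 + " ms (hits=" + hits + ")" );
    }
}
```

The cost of building, serializing and deserializing the normalized name would need separate measurements against the real classes.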

If we keep it, then a lot of tests will have to be fixed. Typically, in
the above test, the code would be :


    @Test
    public void testLdapNameToName() throws Exception
    {
        Dn name = new Dn( "ou= Some   People   ", "dc = eXample", "dc= cOm" );

        assertTrue( name.getName().equals( "ou= Some   People   ,dc = eXample,dc= cOm" ) );

        Dn result = name.apply( schemaManager );

        // Each '\ ' has to be written '\\ ' in the Java source
        assertEquals( "2.5.4.11=\\ some  people\\ ,0.9.2342.19200300.100.1.25=\\ example\\ ,0.9.2342.19200300.100.1.25=\\ com\\ ",
            result.getNormName() );
    }

(lots of added '\ ' ...)

Anyway, I'll be MIA for the next 10 days, with no internet (sort of) and
far from my computer, so I'll leave you to comment and think about that,
and I'll do the measurements when I'm back.

Have fun at the office while I enjoy the beauty of Greece : sun,
sea, ancient buildings and good Greek food ;-)

