uima-dev mailing list archives

From "Richard Eckart de Castilho (Jira)" <...@uima.apache.org>
Subject [jira] [Issue Comment Deleted] (UIMA-6136) FSIndexComparatorImpl.equalsWithoutType() gets slow for many CASes with the same TS
Date Wed, 23 Oct 2019 18:25:00 GMT

     [ https://issues.apache.org/jira/browse/UIMA-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Eckart de Castilho updated UIMA-6136:
---------------------------------------------
    Comment: was deleted

(was: Here you go:

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.cas.CAS;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.metadata.TypeDescription;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;

public class Uima6136Test
{
    public static void main(String[] args) throws Exception
    {
        // Build a type system with 500 annotation types, each with 10 string features.
        TypeSystemDescription tsd = new TypeSystemDescription_impl();
        for (int i = 0; i < 500; i++) {
            TypeDescription type = tsd.addType("Type_" + i, "", CAS.TYPE_NAME_ANNOTATION);
            for (int f = 0; f < 10; f++) {
                type.addFeature("Feature_" + f, "", CAS.TYPE_NAME_STRING);
            }
        }

        // Create 2000 CASes from the same type system and keep them all alive,
        // printing the index and the creation time in milliseconds for each one.
        List<JCas> cas = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            long start = System.currentTimeMillis();
            cas.add(JCasFactory.createJCas(tsd));
            long duration = System.currentTimeMillis() - start;
            System.out.printf("%d - %d%n", i, duration);
        }
    }
})

> FSIndexComparatorImpl.equalsWithoutType() gets slow for many CASes with the same TS
> -----------------------------------------------------------------------------------
>
>                 Key: UIMA-6136
>                 URL: https://issues.apache.org/jira/browse/UIMA-6136
>             Project: UIMA
>          Issue Type: Bug
>          Components: UIMA
>    Affects Versions: 3.1.0SDK
>            Reporter: Richard Eckart de Castilho
>            Priority: Minor
>         Attachments: 2019-10-21_22-23-37.png, 2019-10-21_22-44-25.png
>
>
> When creating several hundred CASes with the same type system, the `shareExisting` mechanism,
> which is designed to save memory, starts to consume a lot of CPU time.
> This screenshot shows that, in my particular case, the method (`equalsWithoutType`) is called
> ~11 million times and takes the bulk of the processing time. The call hierarchy is a bit
> messed up, though: this actually happens when the CASes are initialized.
>  !2019-10-21_22-44-25.png|width=100%!
> The second screenshot shows the actual call hierarchy but, for some reason, the profiler
> doesn't properly descend into the `equals` method here and doesn't count the time spent in
> `equalsWithoutType`.
>  !2019-10-21_22-23-37.png|width=100%!
> So either the method shouldn't be called that often, or it should be much faster.
> In the example, I have about 1800 CAS instances, and their type system has upwards of 200 types.
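
To illustrate how the number of equals() calls described above can grow so large, here is a minimal sketch. It assumes, purely for illustration, that sharing is done by scanning a list of previously seen entries and calling equals() on each one; this is not taken from the UIMA sources, and the actual `shareExisting` code may work differently. The names `Key`, `shareByScan`, `shareByHash`, `countScan` and `countHash` are hypothetical. The sketch counts equals() calls for a linear scan and, for comparison, for a hash-based lookup.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShareExistingSketch {

    /** Hypothetical stand-in for an index comparator; counts equals() calls. */
    static final class Key {
        static long equalsCalls = 0;
        final String typeName;

        Key(String typeName) {
            this.typeName = typeName;
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof Key && ((Key) other).typeName.equals(typeName);
        }

        @Override
        public int hashCode() {
            return typeName.hashCode();
        }
    }

    /** Assumed sharing strategy: linear scan over everything seen so far. */
    static Key shareByScan(List<Key> cache, Key candidate) {
        for (Key existing : cache) {
            if (existing.equals(candidate)) {
                return existing;
            }
        }
        cache.add(candidate);
        return candidate;
    }

    /** Alternative for comparison: look the candidate up by hash code first. */
    static Key shareByHash(Map<Key, Key> cache, Key candidate) {
        Key existing = cache.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    /** Counts equals() calls when `cases` CASes each register `types` entries. */
    static long countScan(int cases, int types) {
        Key.equalsCalls = 0;
        List<Key> cache = new ArrayList<>();
        for (int c = 0; c < cases; c++) {
            for (int t = 0; t < types; t++) {
                shareByScan(cache, new Key("Type_" + t));
            }
        }
        return Key.equalsCalls;
    }

    /** Same workload as countScan, but using the hash-based lookup. */
    static long countHash(int cases, int types) {
        Key.equalsCalls = 0;
        Map<Key, Key> cache = new HashMap<>();
        for (int c = 0; c < cases; c++) {
            for (int t = 0; t < types; t++) {
                shareByHash(cache, new Key("Type_" + t));
            }
        }
        return Key.equalsCalls;
    }

    public static void main(String[] args) {
        // Sizes taken from the issue description: about 1800 CASes, 200 types.
        int cases = 1800;
        int types = 200;
        System.out.printf("linear scan: %d equals() calls%n", countScan(cases, types));
        System.out.printf("hash lookup: %d equals() calls%n", countHash(cases, types));
    }
}
```

Under the linear-scan assumption, each lookup costs as many equals() calls as there are entries ahead of the match, so the total grows with the number of CASes times the square of the number of types. With a hash-based lookup it grows only with the number of CASes times the number of types. Whether either variant matches what UIMA does internally would need to be checked against the profile and the sources.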



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
