uima-dev mailing list archives

From "Marshall Schor (JIRA)" <...@uima.apache.org>
Subject [jira] [Commented] (UIMA-3017) Getting feature value from feature structure longer than expected
Date Mon, 01 Jul 2013 18:40:23 GMT

    [ https://issues.apache.org/jira/browse/UIMA-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697053#comment-13697053 ]

Marshall Schor commented on UIMA-3017:
--------------------------------------

Some comments:

Creating a JCas feature structure reference involves creating a JCas Java object for that
type.  There is a "feature", the JCas cache, which attempts to reuse the JCas Java object
already created for a given instance of that type by looking it up.  This lookup can be
disabled, in which case every creation of an FS reference object creates a new Java object
(which is very fast in modern JVMs).  See http://uima.apache.org/d/uimaj-2.4.0/tutorials_and_users_guides.html#tug.application.pto
for more details.
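
To make the trade-off concrete, here is a minimal, self-contained sketch of the two strategies. It is not UIMA code: the class WrapperCacheSketch, its nested Wrapper class, and the method wrapperFor are hypothetical stand-ins for the JCas cover objects and the cache lookup, and the real JCas cache may well be organized differently.

```java
import java.util.HashMap;
import java.util.Map;

public class WrapperCacheSketch {

    /** Hypothetical stand-in for a JCas cover object that wraps a CAS address. */
    public static final class Wrapper {
        public final int address;

        Wrapper(int address) {
            this.address = address;
        }
    }

    private final boolean cacheEnabled;
    private final Map<Integer, Wrapper> cache = new HashMap<>();

    public WrapperCacheSketch(boolean cacheEnabled) {
        this.cacheEnabled = cacheEnabled;
    }

    /** Returns a wrapper for the given address, reusing an earlier one only if caching is on. */
    public Wrapper wrapperFor(int address) {
        if (!cacheEnabled) {
            // Cache disabled: pay for one small allocation, skip the lookup.
            return new Wrapper(address);
        }
        // Cache enabled: pay for a lookup, reuse the object if one already exists.
        return cache.computeIfAbsent(address, a -> new Wrapper(a));
    }

    public static void main(String[] args) {
        WrapperCacheSketch cached = new WrapperCacheSketch(true);
        WrapperCacheSketch uncached = new WrapperCacheSketch(false);
        System.out.println("cached reuses object:   " + (cached.wrapperFor(7) == cached.wrapperFor(7)));
        System.out.println("uncached reuses object: " + (uncached.wrapperFor(7) == uncached.wrapperFor(7)));
    }
}
```

With the cache on, two requests for the same address return the same Java object; with it off, each request allocates a fresh one, which is the behavior described above.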

For any particular type definition, you may manually edit the XXX_Type class definition and
set the value of the boolean "featOkTst" to false, which disables a runtime check done on
every feature reference.  It's a tiny check, but because so little else happens in a feature
reference, disabling it improves the performance quite a bit in your test.

For instance, you would change:
{code}
public final static boolean featOkTst = JCasRegistry.getFeatOkTst("MyType");
{code}
to
{code}
public final static boolean featOkTst = false;
{code}

Because the field is a static final boolean initialized to the constant false, the compiler
treats it as a compile-time constant and compiles this test out completely.
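
The difference between the two initializers can be illustrated with a small stand-alone sketch. This is not the generated XXX_Type code: the class FeatOkTstSketch, the two flags, and the checksRun counter are hypothetical names used only to show how a guard behaves when the flag is computed at class-initialization time versus when it is the literal false.

```java
public class FeatOkTstSketch {

    // Initialized by a method call, so it is NOT a compile-time constant:
    // the guarded block below stays in the bytecode and runs on every call.
    public static final boolean CHECK_ENABLED = Boolean.parseBoolean("true");

    // Initialized with the literal false, so it IS a compile-time constant:
    // javac omits the block guarded by it from the generated bytecode.
    public static final boolean CHECK_DISABLED = false;

    // Counts how many times a guarded check actually executed.
    public static int checksRun = 0;

    public static String getChecked(String[] values, int index) {
        if (CHECK_ENABLED) {
            checksRun++;
            if (values[index] == null) {
                throw new IllegalStateException("feature missing");
            }
        }
        return values[index];
    }

    public static String getUnchecked(String[] values, int index) {
        if (CHECK_DISABLED) {
            // Never executed; present only to mirror getChecked.
            checksRun++;
            if (values[index] == null) {
                throw new IllegalStateException("feature missing");
            }
        }
        return values[index];
    }

    public static void main(String[] args) {
        String[] values = {"myString"};
        for (int i = 0; i < 1000; i++) {
            getChecked(values, 0);
            getUnchecked(values, 0);
        }
        // Only the calls to getChecked contribute to the counter.
        System.out.println("checks run: " + checksRun);
    }
}
```

Running javap -c on the compiled class should show that getUnchecked contains only the array load and return, while getChecked still reads the flag and branches on it.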

                
> Getting feature value from feature structure longer than expected
> -----------------------------------------------------------------
>
>                 Key: UIMA-3017
>                 URL: https://issues.apache.org/jira/browse/UIMA-3017
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Core Java Framework
>    Affects Versions: 2.3
>         Environment: Linux x86_64
>            Reporter: Mike Barborak
>            Priority: Minor
>
> Should getting a value of a feature in a feature structure be fast? Intuitively, I would
> expect performance to be about the same as getting an entry from a Java HashMap or faster
> but in my experiments it seems to be 8 times slower. To solve my problem, I wrap my feature
> structures with caching Java code but it seems that there might be an opportunity to speed
> up UIMA generally.
> My test creates a CAS with a single feature structure in it. It sets a string feature
> in that feature structure and then simply gets the value of that feature in a tight loop.
> I compare that to an instance of a Java class that has an internal HashMap of strings to strings.
> In that case, a method is called on that instance to get an entry from the map in a very tight
> loop.
> I do 5 rounds of each of the loops. The total times for the rounds involving the CAS
> were:
> round 0 total time 1: 7.520104509s
> round 1 total time 1: 6.812214938s
> round 2 total time 1: 6.882752307s
> round 3 total time 1: 6.728515004s
> round 4 total time 1: 6.813674956s
> The total times for the rounds just using the Java class were:
> round 0 total time 2: 0.847296054s
> round 1 total time 2: 0.814570347s
> round 2 total time 2: 0.814399859s
> round 3 total time 2: 0.814189383s
> round 4 total time 2: 0.814979357s
> Here is my Java code:
> {code:title=MyTest.java}
> package test;
> import java.io.InputStream;
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.uima.UIMAFramework;
> import org.apache.uima.cas.CAS;
> import org.apache.uima.cas.Feature;
> import org.apache.uima.cas.FeatureStructure;
> import org.apache.uima.cas.Type;
> import org.apache.uima.resource.metadata.TypeSystemDescription;
> import org.apache.uima.util.CasCreationUtils;
> import org.apache.uima.util.XMLInputSource;
> public class MyTest {
>   
>   static class MyClass {
>     Map<String, String> myFeatures = new HashMap<String, String>();
>     
>     void setStringValue(String feature, String value) {
>       myFeatures.put(feature, value);
>     }
>     
>     String getStringValue(String feature) {
>       return myFeatures.get(feature);
>     }
>   }
>   
>   static public void main(String[] argv) throws Exception {
>     InputStream stream = TestSupport.class.getClassLoader().getResourceAsStream("MyTypes.xml");
>     TypeSystemDescription typeSystemDescription = UIMAFramework.getXMLParser().parseTypeSystemDescription(new XMLInputSource(stream, null));
>     CAS cas = CasCreationUtils.createCas(typeSystemDescription, null, null);
>     Type myType = cas.getTypeSystem().getType("MyType");
>     FeatureStructure fs = cas.createFS(myType);
>     Feature myFeature = myType.getFeatureByBaseName("myFeature");
>     fs.setStringValue(myFeature, "myString");
>     cas.addFsToIndexes(fs);
>     
>     MyClass myInstance = new MyClass();
>     myInstance.setStringValue("myFeature2", "myString2");
>     
>     long iterations = 100000000;
>     double nanoSecsPerSec = 1000000000.0d;
>     
>     for (int round = 0; round < 5; round++) {
>       long start = System.nanoTime();
>       for (long i = 0; i < iterations; i++) {
>         fs.getStringValue(myFeature);
>       }
>       long end = System.nanoTime();
>       System.out.println("round " + round + " total time 1: " + ((end - start) / nanoSecsPerSec) + "s");
>     }
>       
>     for (int round = 0; round < 5; round++) {
>       long start = System.nanoTime();
>       for (long i = 0; i < iterations; i++) {
>         myInstance.getStringValue("myFeature2");
>       }
>       long end = System.nanoTime();
>       System.out.println("round " + round + " total time 2: " + ((end - start) / nanoSecsPerSec) + "s");
>     }
>   }
> }
> {code}
> Here is my type descriptor:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
>   <name>MyTypes</name>
>   <description/>
>   <version>1.0</version>
>   <vendor/>
>   <types>
>     <typeDescription>
>       <name>MyType</name>
>       <description/>
>       <supertypeName>uima.cas.TOP</supertypeName>
>       <features>
>         <featureDescription>
>           <name>myFeature</name>
>           <description></description>
>           <rangeTypeName>uima.cas.String</rangeTypeName>
>         </featureDescription>
>       </features>
>     </typeDescription>
>   </types>
> </typeSystemDescription>
> {code}

