uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: [jira] [Commented] (UIMA-3141) Binary CAS format 6 + type filtering fails to deserialize document annotation correctly
Date Sun, 04 Aug 2013 22:28:31 GMT

On 8/4/2013 6:11 PM, Richard Eckart de Castilho (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/UIMA-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729011#comment-13729011 ]
>
> Richard Eckart de Castilho commented on UIMA-3141:
> --------------------------------------------------
>
> If the custom sub-type of DocumentAnnotation is part of the target type system, it works (not verified in exactly the given test case, but in the context from which this test was distilled).
>
> Since the document annotation is a special annotation in UIMA, it may require special handling. I would expect that all features are set if they are available on the document annotation, even if the type of the document annotation is not the same.

I'm not following... All features of DocMeta are set.  It's the entire type
instance of DocMeta that's being "filtered out" when deserializing.
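
To make sure we're talking about the same thing, here is a toy sketch of what
I mean by "filtered out". It is plain Java, not UIMA code: the class
TypeFilterSketch, the Fs class and both helper methods are made up for
illustration, and the real form 6 deserializer is more involved. The only
assumption it encodes is that an instance is dropped when its exact type is
missing from the target type system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names are UIMA APIs.
final class TypeFilterSketch {
    static final String DOC_ANNOT = "uima.tcas.DocumentAnnotation";
    static final String DEFAULT_LANGUAGE = "x-unspecified";

    // Hypothetical stand-in for a document annotation instance.
    static final class Fs {
        final String type;
        final String language;

        Fs(String type, String language) {
            this.type = type;
            this.language = language;
        }
    }

    // Keep an instance only if its exact type exists in the target type system.
    static List<Fs> filter(List<Fs> source, Set<String> targetTypes) {
        List<Fs> kept = new ArrayList<>();
        for (Fs fs : source) {
            if (targetTypes.contains(fs.type)) {
                kept.add(fs);
            }
        }
        return kept;
    }

    // Language of the first surviving instance, or the default if none survived.
    static String documentLanguage(List<Fs> survivors) {
        return survivors.isEmpty() ? DEFAULT_LANGUAGE : survivors.get(0).language;
    }

    public static void main(String[] args) {
        List<Fs> source = Arrays.asList(new Fs("DocMeta", "latin"));
        Set<String> withoutDocMeta = new HashSet<>(Arrays.asList(DOC_ANNOT));
        Set<String> withDocMeta = new HashSet<>(Arrays.asList(DOC_ANNOT, "DocMeta"));

        // DocMeta is unknown to the target, so the whole instance is dropped.
        System.out.println(documentLanguage(filter(source, withoutDocMeta)));
        // DocMeta is known to the target, so the instance and its language survive.
        System.out.println(documentLanguage(filter(source, withDocMeta)));
    }
}
```

With DocMeta missing from the target, the sketch falls back to "x-unspecified";
with DocMeta present, the language comes through as "latin". That is the
pattern reported in the issue.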

I'm probably not understanding your point correctly though - please say more.

-Marshall
>
>> Binary CAS format 6 + type filtering fails to deserialize document annotation correctly
>> ----------------------------------------------------------------------------------------
>>
>>                 Key: UIMA-3141
>>                 URL: https://issues.apache.org/jira/browse/UIMA-3141
>>             Project: UIMA
>>          Issue Type: Bug
>>          Components: Core Java Framework
>>    Affects Versions: 2.4.1SDK
>>            Reporter: Richard Eckart de Castilho
>>            Assignee: Marshall Schor
>>
>> When a custom document annotation type is used, the language is not properly restored after deserializing from CAS format 6.
>> Expected: deserialized CAS has language "latin"
>> Actual: deserialized CAS has language "x-unspecified"
>> If the line {{sourceCas.addFsToIndexes(ma);}} is commented out, the code works.
>> {code}
>> import static org.junit.Assert.assertEquals;
>> import static org.junit.Assert.assertTrue;
>> import java.io.File;
>> import java.io.FileInputStream;
>> import java.io.FileOutputStream;
>> import java.io.InputStream;
>> import java.io.OutputStream;
>> import org.apache.commons.io.IOUtils;
>> import org.apache.uima.cas.CAS;
>> import org.apache.uima.cas.impl.Serialization;
>> import org.apache.uima.cas.text.AnnotationFS;
>> import org.apache.uima.resource.metadata.TypeSystemDescription;
>> import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;
>> import org.apache.uima.util.CasCreationUtils;
>> import org.junit.Rule;
>> import org.junit.Test;
>> import org.junit.rules.TemporaryFolder;
>> public class MinimalTest
>> {
>>     @Rule
>>     public TemporaryFolder testFolder = new TemporaryFolder();
>>     @Test
>>     public void test()
>>         throws Exception
>>     {
>>         TypeSystemDescription sourceTsd = new TypeSystemDescription_impl();
>>         sourceTsd.addType("DocMeta", "", CAS.TYPE_NAME_DOCUMENT_ANNOTATION);
>>         TypeSystemDescription targetTsd = new TypeSystemDescription_impl();
>>         CAS sourceCas = CasCreationUtils.createCas(sourceTsd, null, null);
>>         AnnotationFS ma = sourceCas.createAnnotation(sourceCas.getTypeSystem().getType("DocMeta"),
>>                 0, 0);
>>         sourceCas.addFsToIndexes(ma);
>>         sourceCas.setDocumentLanguage("latin");
>>         sourceCas.setDocumentText("test");
>>         File file = testFolder.newFile("test.bin");
>>         OutputStream os = new FileOutputStream(file);
>>         Serialization.serializeWithCompression(sourceCas, os, sourceCas.getTypeSystem());
>>         IOUtils.closeQuietly(os);
>>         assertTrue(new File(testFolder.getRoot(), "test.bin").exists());
>>         CAS targetCas = CasCreationUtils.createCas(targetTsd, null, null);
>>         InputStream is = new FileInputStream(file);
>>         Serialization.deserializeCAS(targetCas, is, sourceCas.getTypeSystem(), null);
>>         IOUtils.closeQuietly(is);
>>         assertEquals("latin", targetCas.getDocumentLanguage());
>>         assertEquals("test", targetCas.getDocumentText());
>>     }
>> }
>> {code}
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>

