uima-dev mailing list archives

From "Richard Eckart de Castilho (JIRA)" <...@uima.apache.org>
Subject [jira] [Created] (UIMA-3141) Binary CAS format 6 initializes document meta data wrong
Date Fri, 02 Aug 2013 19:45:49 GMT
Richard Eckart de Castilho created UIMA-3141:

             Summary: Binary CAS format 6 initializes document meta data wrong
                 Key: UIMA-3141
                 URL: https://issues.apache.org/jira/browse/UIMA-3141
             Project: UIMA
          Issue Type: Bug
          Components: Core Java Framework
    Affects Versions: 2.4.1SDK
            Reporter: Richard Eckart de Castilho
            Assignee: Marshall Schor

When a custom document annotation type is used, the language is not properly restored after
deserializing from CAS format 6.

Expected: deserialized CAS has language "latin"

Actual: deserialized CAS has language "x-unspecified"

If the line {{sourceCas.addFsToIndexes(ma);}} is commented out, the test passes.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.commons.io.IOUtils;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.Serialization;
import org.apache.uima.cas.text.AnnotationFS;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.resource.metadata.impl.TypeSystemDescription_impl;
import org.apache.uima.util.CasCreationUtils;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

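public class MinimalTest
{
    @Rule
    public TemporaryFolder testFolder = new TemporaryFolder();

    @Test
    public void test()
        throws Exception
    {
        // The source type system declares a custom subtype of the document annotation;
        // the target type system does not.
        TypeSystemDescription sourceTsd = new TypeSystemDescription_impl();
        sourceTsd.addType("DocMeta", "", CAS.TYPE_NAME_DOCUMENT_ANNOTATION);
        TypeSystemDescription targetTsd = new TypeSystemDescription_impl();

        CAS sourceCas = CasCreationUtils.createCas(sourceTsd, null, null);
        // These two setters are missing from the archived message; they are assumed
        // here because the assertions at the end check for exactly these values.
        sourceCas.setDocumentText("test");
        sourceCas.setDocumentLanguage("latin");
        AnnotationFS ma = sourceCas.createAnnotation(sourceCas.getTypeSystem().getType("DocMeta"),
                0, 0);
        // The line named in the description (its exact position is assumed):
        // commenting it out makes the test pass.
        sourceCas.addFsToIndexes(ma);

        File file = testFolder.newFile("test.bin");

        // Serialize with binary CAS format 6 (compressed, with type filtering).
        OutputStream os = new FileOutputStream(file);
        try {
            Serialization.serializeWithCompression(sourceCas, os, sourceCas.getTypeSystem());
        }
        finally {
            IOUtils.closeQuietly(os);
        }

        assertTrue(new File(testFolder.getRoot(), "test.bin").exists());

        // Deserialize into a CAS built from the target type system.
        CAS targetCas = CasCreationUtils.createCas(targetTsd, null, null);
        InputStream is = new FileInputStream(file);
        try {
            Serialization.deserializeCAS(targetCas, is, sourceCas.getTypeSystem(), null);
        }
        finally {
            IOUtils.closeQuietly(is);
        }

        // Fails: the language comes back as "x-unspecified".
        assertEquals("latin", targetCas.getDocumentLanguage());
        assertEquals("test", targetCas.getDocumentText());
    }
}
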
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
