From: "Adam Lally"
Sender: lally.adam@gmail.com
To: uima-dev@incubator.apache.org
Date: Mon, 27 Nov 2006 10:37:29 -0500
Subject: Re: Result specification - update needed
Message-ID: <2787e08a0611270737o2dd67021vb9554e180535f4d0@mail.gmail.com>
In-Reply-To: <45686B48.8040607@schor.com>

On 11/25/06, Marshall Schor wrote:
> I need to write up the version 2 tutorial and user's guide for Results
> Specification. The current write up is inaccurate, I think. I started
> to change it to fit the new API where it is not passed in as a
> parameter, but there are more things that need fixing.
>
> Could Adam and/or Thilo take a look at this write up and fix it up?
> (see below):
>

Yes, this needed an overhaul. Result Specification handling in aggregates no longer has anything to do with the type of flow. Here's my suggested documentation (note I used tags for monospace font as in HTML, I have no idea if that's right for docbook):

Result Specification Setting

The Result Specification is passed to the annotator instance by calling its setResultSpecification method. When called, the default implementation saves the Result Specification in an instance variable of the annotator instance, which the annotator can access through the protected getResultSpecification() method.

A Result Specification is a list of output types and/or type:feature names that the annotator is expected to produce. Annotators may use it to optimize their operation, when possible, for cases where only particular outputs are wanted. The interface to the Result Specification object (see the JavaDocs) allows querying both types and particular features of types.

Sometimes you can specify the Result Specification; other times you cannot (for instance, inside a Collection Processing Engine). When you cannot specify it, or choose not to (for example, by using the form of the process(...) call on an Analysis Engine that doesn't include the Result Specification), a Default Result Specification is used.

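As a rough illustration of how an annotator might use the saved Result Specification to skip unwanted work, here is a minimal sketch. It deliberately does not use the UIMA classes: the Result Specification is modeled as a plain set of strings, and SketchAnnotator and the example.* type names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not the UIMA API: a Result Specification is represented
// as a plain set of type names and "type:feature" names.
class SketchAnnotator {
    private Set<String> resultSpec = new HashSet<>();

    // Mirrors the described default behavior: save the spec in an instance variable.
    void setResultSpecification(Set<String> spec) {
        this.resultSpec = new HashSet<>(spec);
    }

    // The annotator reads the saved spec through this accessor.
    protected Set<String> getResultSpecification() {
        return resultSpec;
    }

    // Returns the names of the outputs this call would produce.
    List<String> process(String text) {
        List<String> produced = new ArrayList<>();
        Set<String> spec = getResultSpecification();
        if (spec.contains("example.Token")) {
            produced.add("example.Token"); // tokenization would run here
        }
        if (spec.contains("example.Token:partOfSpeech")) {
            produced.add("example.Token:partOfSpeech"); // tagging would run here
        }
        return produced;
    }

    public static void main(String[] args) {
        SketchAnnotator annotator = new SketchAnnotator();
        // Only tokens are requested, so the tagging step is skipped.
        annotator.setResultSpecification(new HashSet<>(Arrays.asList("example.Token")));
        System.out.println(annotator.process("some text"));
    }
}
```

In a real annotator the same checks would go through the query methods of the Result Specification interface described in the JavaDocs.
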
Default Result Specification

The default Result Specification is taken from the Engine's output Capability Specification. Remember that a Capability Specification has both inputs and outputs, can specify types and/or features, and can contain more than one Capability Set. If there is more than one set, the logical union of these sets is used. The default Result Specification is exactly what's included in the output Capability Specification.

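The union rule can be sketched as follows. This is a simplified model under stated assumptions, not the framework's code: CapabilitySet is a hypothetical stand-in holding input and output names as strings, and defaultResultSpec is an illustrative helper.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class DefaultResultSpecSketch {

    // Hypothetical stand-in for one Capability Set: declared inputs and outputs,
    // each a set of type names and "type:feature" names.
    static final class CapabilitySet {
        final Set<String> inputs;
        final Set<String> outputs;

        CapabilitySet(Set<String> inputs, Set<String> outputs) {
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Default Result Specification: union of the outputs of every Capability Set.
    // Inputs are ignored here.
    static Set<String> defaultResultSpec(List<CapabilitySet> capabilitySets) {
        Set<String> union = new TreeSet<>();
        for (CapabilitySet capabilitySet : capabilitySets) {
            union.addAll(capabilitySet.outputs);
        }
        return union;
    }

    public static void main(String[] args) {
        List<CapabilitySet> sets = Arrays.asList(
            new CapabilitySet(setOf(), setOf("example.Token")),
            new CapabilitySet(setOf("example.Token"),
                              setOf("example.Token", "example.Sentence:length")));
        System.out.println(defaultResultSpec(sets));
    }
}
```
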
Passing Result Specifications to Analysis Engines

If you are not using a Collection Processing Engine, you can specify a Result Specification for your AnalysisEngine(s) by calling the AnalysisEngine.setResultSpecification(ResultSpecification) method.

It is also possible to pass a Result Specification on each call to AnalysisEngine.process(CAS, ResultSpecification). However, this is not recommended if your Result Specification will stay constant across multiple calls to process. In that case it is more efficient to call AnalysisEngine.setResultSpecification(ResultSpecification) only when the Result Specification changes.

For primitive Analysis Engines, whatever Result Specification you pass in is passed along to the annotator's setResultSpecification(ResultSpecification) method. For aggregate Analysis Engines, see below.

Aggregates

For aggregate engines, the Result Specification passed to the AnalysisEngine.setResultSpecification(ResultSpecification) method is intended to specify the set of output types/features that the aggregate should produce. This is not necessarily equivalent to the set of output types/features that each annotator should produce. For example, an annotator may need to produce an intermediate type that is then consumed by a downstream annotator, even though that intermediate type is not part of the Result Specification.

To handle this situation, when AnalysisEngine.setResultSpecification(ResultSpecification) is called on an aggregate, the framework computes the union of the passed Result Specification with the set of all input types and features of all component AnalysisEngines within that aggregate. This forms the complete set of types and features that any component of the aggregate might need to produce. This derived Result Specification is then passed to the AnalysisEngine.setResultSpecification(ResultSpecification) of each component AnalysisEngine. In the case of nested aggregates, this procedure is applied recursively.

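The derivation for aggregates can be sketched like this. It is a simplified model rather than the framework's implementation: Engine is a hypothetical class that records declared inputs and the Result Specification it last received, and all names are strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class AggregateSpecSketch {

    // Hypothetical stand-in for an Analysis Engine. A primitive engine has no
    // components; an aggregate has one or more.
    static final class Engine {
        final String name;
        final Set<String> inputs;      // declared input types/features
        final List<Engine> components; // empty for a primitive engine
        Set<String> receivedSpec = new TreeSet<>();

        Engine(String name, Set<String> inputs, Engine... components) {
            this.name = name;
            this.inputs = inputs;
            this.components = Arrays.asList(components);
        }

        void setResultSpecification(Set<String> requested) {
            receivedSpec = new TreeSet<>(requested);
            if (components.isEmpty()) {
                return; // primitive: the spec would go to the annotator as-is
            }
            // Aggregate: union the requested outputs with every component's inputs...
            Set<String> derived = new TreeSet<>(requested);
            for (Engine component : components) {
                derived.addAll(component.inputs);
            }
            // ...and pass the derived spec down; nested aggregates repeat the step.
            for (Engine component : components) {
                component.setResultSpecification(derived);
            }
        }
    }

    static Set<String> setOf(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        Engine tokenizer = new Engine("tokenizer", setOf());
        Engine tagger = new Engine("tagger", setOf("example.Token"));
        Engine aggregate = new Engine("aggregate", setOf(), tokenizer, tagger);

        // The caller only asks for part-of-speech output.
        aggregate.setResultSpecification(setOf("example.PartOfSpeech"));
        System.out.println(tokenizer.name + " receives " + tokenizer.receivedSpec);
    }
}
```

In this sketch the tokenizer's derived specification includes example.Token even though the caller never requested it, because the tagger declares it as an input.
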
Collection Processing Engines

The Default Result Specification is always used for all components of a Collection Processing Engine.