uima-dev mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: UIMA RUTA - Custom BLOCK extension
Date Mon, 14 Dec 2015 17:51:20 GMT
Hi,

Am 14.12.2015 um 18:20 schrieb Miguel Alvarez:
> Thanks Peter! That is exactly what I was looking for. I think my code wasn't
> working because of the way I was invoking the constructor, which I didn't
> include in my previous emails. I assume since you have included this in the
> code already, I don't need to do anything to contribute it, right?

Yes... but you are welcome to come up with other great ideas ;-)

I will take care of the documentation. I am also thinking about adding the
first (wrong) variant.

Does anyone have ideas about the naming? If not, it will remain
DOCUMENTBLOCK or something similar.


Best,

Peter

> I hope this new block is useful to many other people.
>
> Thanks,
> Miguel
>
> -----Original Message-----
> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
> Sent: December 14, 2015 4:37
> To: dev@uima.apache.org
> Subject: Re: UIMA RUTA - Custom BLOCK extension
>
> Hi,
>
> sorry, I misinterpreted your use case.
>
> Yes, you are completely right and your code looks correct.
>
> If getList() does not return the matches, then either the rule wasn't able
> to find any anchors at all to start matching, or the apply was called with
> false, meaning the matches are not stored for performance reasons. You should
> be able to just delegate to the RutaScriptBlock with a reset RutaStream:
>
> @Override
> public ScriptApply apply(RutaStream stream, InferenceCrowd crowd) {
>   // Widen the window to the whole document before delegating to the block.
>   CAS cas = stream.getCas();
>   AnnotationFS documentAnnotation = cas.getDocumentAnnotation();
>   RutaStream completeStream =
>       stream.getWindowStream(documentAnnotation, documentAnnotation.getType());
>   ScriptApply result = super.apply(completeStream, crowd);
>   return result;
> }
>
> I added this to the current trunk:
> block impl:
> https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core-ext/src/main/java/org/apache/uima/ruta/block/DocumentBlock.java
> unit test:
> https://svn.apache.org/repos/asf/uima/ruta/trunk/ruta-core-ext/src/test/java/org/apache/uima/ruta/block/DocumentBlockTest.java
>
>
> Does this work for you?
>
> Best,
>
> Peter
>
>
> Am 14.12.2015 um 04:14 schrieb Miguel Alvarez:
>> Hi Peter,
>>
>>   
>>
>> Thanks for your prompt reply.
>>
>>   
>>
>> Let me know if I am wrong, but I don’t think the code you sent would
>> work when the custom BLOCK extension is nested inside another block.
>> For instance, let’s say we have these annotations in some text:
>>   
>>
>> Annotation1 Annotation2 Annotation3 Annotation2 Annotation4 Annotation2
>>
>>   
>>
>> BLOCK Annotation3{} {
>>      // Extract some information from Annotation3’s features and store
>>      // them in variables
>>      DOCUMENTBLOCK Annotation2{} {
>>          // Use the information extracted from Annotation3 to determine
>>          // if this particular Annotation2 is the one I want
>>      }
>> }
>>
>>   
>>
>> And I actually want the custom BLOCK extension to have the right
>> context when within the BLOCK. So I want the DOCUMENTBLOCK extension
>> to look for Annotation2 in the whole document, but once you are inside
>> the DOCUMENTBLOCK, Annotation2 should be the new scope (as the current
>> BLOCK statement does right now).
>>
>>   
>>
>> So initially this is the code I tried:
>>
>>   
>>
>>     public ScriptApply apply(RutaStream stream, InferenceCrowd crowd) {
>>       // Create a new stream of the whole document
>>       RutaStream docStream = stream.getWindowStream(
>>           stream.getCas().getDocumentAnnotation(),
>>           stream.getCas().getDocumentAnnotation().getType());
>>       BlockApply result = new BlockApply(this);
>>       crowd.beginVisit(this, result);
>>       RuleApply apply = rule.apply(docStream, crowd, true);
>>       for (AbstractRuleMatch<? extends AbstractRule> eachMatch : apply.getList()) {
>>         if (eachMatch.matched()) {
>>           List<AnnotationFS> matchedAnnotations =
>>               ((RuleMatch) eachMatch).getMatchedAnnotations(null, null);
>>           if (matchedAnnotations == null || matchedAnnotations.isEmpty()) {
>>             continue;
>>           }
>>           AnnotationFS each = matchedAnnotations.get(0);
>>           if (each == null) {
>>             continue;
>>           }
>>
>>           List<Type> types = ((RutaRuleElement) rule.getRuleElements().get(0))
>>               .getMatcher().getTypes(getParent() == null ? this : getParent(), docStream);
>>           for (Type eachType : types) {
>>             RutaStream window = docStream.getWindowStream(each, eachType);
>>             for (RutaStatement element : getElements()) {
>>               if (element != null) {
>>                 element.apply(window, crowd);
>>               }
>>             }
>>           }
>>         }
>>       }
>>       crowd.endVisit(this, result);
>>       return result;
>>     }
>>
>>   
>>
>> I thought I would just get a new stream that covers the whole
>> document, and apply the rules to that but the call “apply.getList()”
>> would never return anything even though I don’t have any conditions in
>> the RUTA script for the DOCUMENTBLOCK extension. And that is why I
>> ended up calling the method getAllOfType, because that one was working
>> fine, but of course, it doesn’t apply the conditions.
>>
>>   
>>
>> Any ideas why the “getList” wouldn’t return anything even though I am
>> passing a new stream that covers the whole document?
>>
>>   
>>
>> If I get this to work, I have no problems contributing it to the UIMA
>> RUTA project.
>>
>>   
>>
>> Cheers,
>>
>> Miguel
>>
>>   
>>
>>   
>>
>> From: Peter Klügl <peter.kluegl@...>
>> Subject: Re: UIMA RUTA - Custom BLOCK extension
>> Newsgroups: gmane.comp.apache.uima.devel
>> Date: 2015-12-13 11:51:42 GMT
>>
>> Hi,
>>   
>> oh yes, this is a nice extension. I was also already planning to add
>> something like this, but in my use cases the explicit referencing of
>> each matched annotation in the global context was missing. Thus, I am
>> implementing the annotation issues first.
>>   
>> It is possible to specify something like this right now in UIMA Ruta
>> but I would not recommend it. You could either spam/remove annotations
>> on the complete document or you could use the recursion functionality
>> of BLOCKs.
>>   
>> Now to the custom block:
>>   
>> You need to apply the head rule of the block in order to evaluate the
>> conditions. The scope is changed by the usage of a new restricted
>> RutaStream (windowStream). In order to retain the scope, just use the
>> given RutaStream.
>>   
>> Without having tested it, it could look something like:
>>   
>>     @Override
>>     public ScriptApply apply(RutaStream stream, InferenceCrowd crowd) {
>>       BlockApply result = new BlockApply(this);
>>       crowd.beginVisit(this, result);
>>       RuleApply apply = rule.apply(stream, crowd, true);
>>       for (AbstractRuleMatch<? extends AbstractRule> eachMatch : apply.getList()) {
>>         if (eachMatch.matched()) {
>>           for (RutaStatement element : getElements()) {
>>             if (element != null) {
>>               element.apply(stream, crowd);
>>             }
>>           }
>>         }
>>       }
>>       crowd.endVisit(this, result);
>>       return result;
>>     }
>>   
>> Let me know if this helps.
>>   
>> Do you want to contribute the block extension?
>>   
>> Best,
>>   
>> Peter
>>   
>> Am 12.12.2015 um 00:04 schrieb Miguel Alvarez:
>>> Hi,
>>>
>>>    
>>>
>>> I am in the process of developing a custom BLOCK extension that,
>>> instead of changing the scope of the block, uses the scope of the
>>> whole document. With this type of BLOCK one could loop through a
>>> series of annotations, and for each of those annotations search the
>>> whole document for something else. I guess my first question is: Is it
>>> even possible to do something like this without creating a custom
>>> BLOCK extension?
>>>
>>>    
>>>
>>> I got something to work, but it doesn't seem to apply the conditions
>>> for the block. This is more or less the code I have so far:
>>>
>>>    
>>>
>>>     List<Type> types = ((RutaRuleElement) rule.getRuleElements().get(0))
>>>         .getMatcher().getTypes(getParent() == null ? this : getParent(), stream);
>>>     for (Type eachType : types) {
>>>       //System.out.println("each Type: " + eachType.getShortName());
>>>       for (AnnotationFS each : stream.getAllofType(eachType)) {
>>>         RutaStream window = stream.getWindowStream(each, eachType);
>>>         for (RutaStatement element : getElements()) {
>>>           if (element != null) {
>>>             element.apply(window, crowd);
>>>           }
>>>         }
>>>       }
>>>     }
>>>
>>>    
>>>
>>> I assume in order to apply the conditions I would need something like
>>> this:
>>>
>>>     RuleApply apply = rule.apply(stream, crowd);
>>>
>>>    
>>>
>>> But for some reason this doesn't work, because I guess the scope has
>>> already been changed and it is not able to find any of the annotations
>>> within the scope.
>>>
>>>    
>>>
>>> Does this make any sense? Is there a better way to do this?
>>>
>>>    
>>>
>>> Any help would be much appreciated.
>>>
>>>    
>>>
>>> Cheers,
>>>
>>> Miguel
>>>

