uima-dev mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: Ruta: reloadScript and external engines
Date Mon, 08 Feb 2016 10:49:32 GMT
Hi,

If any files were attached to this mail, they have been removed. You
could attach them to the Jira issue.

Best,

Peter

On 05.02.2016 at 17:00, Miguel Alvarez wrote:
> Find attached the source files...
>
> I assume the extensions don't support all the types of parameters (unless you tell me
> otherwise :-) ), so I had to convert the resource parameters to simple string expressions,
> and the feature assignments to string/integer parameter pairs...
>
> -----Original Message-----
> From: Peter Klügl [mailto:peter.kluegl@averbis.com] 
> Sent: February 5, 2016 7:23
> To: dev@uima.apache.org
> Subject: Re: Ruta: reloadScript and external engines
>
> hi,
>
> Ah, OK. I must admit that I was not sure how this issue could be implemented, since the connection
> to the file is lost. The wordlist resources are in need of refactoring. I am curious how you
> solved it.
>
> Best,
>
> Peter
>
> On 05.02.2016 at 16:03, Miguel Alvarez wrote:
>> I created two extensions that replicate MARKFAST and
>> MARKTABLE but keep the last modified date of the txt or csv files and
>> reload them if it changes. They work well, and this way I don't need to
>> have reloadScript set to true.
>>
>> I created a Jira enhancement for this.
>>
>> Cheers
>> Miguel
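
The reload-on-change idea described above (remember the file's last modified date and reread the file when it differs) can be illustrated with a minimal sketch. This is not the code of the extensions mentioned in the thread, which is not shown here; the class name `ReloadingWordList` and its methods are hypothetical, and the sketch uses only the standard Java NIO API rather than any Ruta classes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper: caches a word list and rereads it when the file's mtime changes. */
final class ReloadingWordList {
  private final Path file;
  private FileTime lastLoaded;                 // null until the first load
  private Set<String> words = Collections.emptySet();
  private int loadCount;                       // how often the file was actually read

  ReloadingWordList(Path file) {
    this.file = file;
  }

  /** Returns the current entries, rereading the file only if its timestamp changed. */
  synchronized Set<String> words() throws IOException {
    FileTime modified = Files.getLastModifiedTime(file);
    if (!modified.equals(lastLoaded)) {
      Set<String> fresh = new LinkedHashSet<>();
      for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
        String entry = line.trim();
        if (!entry.isEmpty()) {
          fresh.add(entry);
        }
      }
      words = Collections.unmodifiableSet(fresh);
      lastLoaded = modified;
      loadCount++;
    }
    return words;
  }

  synchronized int loadCount() {
    return loadCount;
  }
}
```

Calling `words()` once per processed document costs one timestamp lookup when the file is unchanged. Timestamp granularity depends on the file system, so two edits in very quick succession might not be detected by a check like this.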
>> On Feb 2, 2016 06:11, "Miguel Alvarez" <miguelal007@gmail.com> wrote:
>>
>>> Hi Peter
>>>
>>> Thanks again for your reply. I am trying to get more familiar with 
>>> the RUTA code so I can make these changes myself, but I am not there 
>>> yet, there are a lot of things I still don't understand :) I will 
>>> create a Jira issue for now.
>>>
>>> Yesterday I actually already tried the workaround you suggested, but
>>> it doesn't seem to work either. I refactored the code so
>>> the dictionary logic is in separate scripts and set the reloadScript
>>> parameter to true only for those scripts. The rest of the scripts,
>>> including the main script (the one that starts the chain of
>>> scripts), have it set to false, but it seems that unless I set all
>>> the scripts to true the dictionaries don't get reloaded.
>>>
>>> But now that I think about it, I should be able to create
>>> extensions that allow us to reload only the dictionaries, and that
>>> way we can leave that parameter set to false for all the engines.
>>>
>>> I will let you know if that works.
>>>
>>> Cheers
>>> Miguel
>>> On Feb 1, 2016 23:54, "Peter Klügl" <peter.kluegl@averbis.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> the analysis engines do not necessarily need to be created anew. If 
>>>> you want, you can create a jira issue for it and I will fix it.
>>>>
>>>> There is currently no option to reload the dictionaries separately.
>>>> As a workaround, you could extract/refactor all dictionary logic into
>>>> a separate Ruta analysis engine/script and then set the
>>>> parameter to true only for this one.
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> On 01.02.2016 at 20:04, Miguel Alvarez wrote:
>>>>> Hi Peter,
>>>>>
>>>>> Thanks for your reply.
>>>>>
>>>>> It isn't necessarily causing any problems, I just wanted to
>>>>> understand how it was meant to work. But I can explain my situation
>>>>> a bit better, and maybe you have a better suggestion.
>>>>>
>>>>> We are currently setting the reloadScript parameter to true in our
>>>>> RUTA engines so the dictionaries reload without us having to
>>>>> restart the service. But we have some external engines, invoked from
>>>>> RUTA scripts, which create connections to other servers. Until now we
>>>>> have been storing these connections as instance variables in our
>>>>> external engines so they can be reused and the engine doesn't need to
>>>>> create a new connection for every document processed. The initialize
>>>>> method checks whether the engine instance already has an open
>>>>> connection, so no matter how many times initialize is invoked, only
>>>>> one connection is established.
>>>>>
>>>>> But if we invoke this external engine from a RUTA script that has
>>>>> the reloadScript parameter set to true, a new instance of the engine
>>>>> is created for every document processed, and therefore a new
>>>>> connection to the remote server is established for each document
>>>>> too, regardless of my check for an existing connection in the
>>>>> initialize method (obviously, because it is a brand new instance
>>>>> every time).
>>>>>
>>>>> I guess one question I have is: Can we force the reload of 
>>>>> dictionaries directly from the RUTA script or in any other way?
>>>>>
>>>>> Thanks,
>>>>> Miguel
>>>>>
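
The effect described in this message can be reproduced outside of UIMA with a small sketch. All names below (`FakeConnection`, `InstanceScopedEngine`, `SharedConnectionEngine`) are hypothetical stand-ins, not UIMA or Ruta classes. The first engine follows the pattern from the message: the connection is an instance field guarded by a null check in `initialize`. The second engine is an assumed alternative that is not proposed anywhere in the thread: it keeps the connection in a static field so that it survives the creation of new engine instances.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for an expensive remote connection; counts how many were opened. */
final class FakeConnection {
  static final AtomicInteger OPENED = new AtomicInteger();

  FakeConnection() {
    OPENED.incrementAndGet();
  }
}

/** Pattern from the message: the connection lives in an instance field. */
final class InstanceScopedEngine {
  private FakeConnection connection;

  void initialize() {
    // Protects against repeated initialize() calls on the SAME instance only.
    if (connection == null) {
      connection = new FakeConnection();
    }
  }
}

/** Assumed alternative: one connection shared by all instances of the class. */
final class SharedConnectionEngine {
  private static FakeConnection shared;

  void initialize() {
    synchronized (SharedConnectionEngine.class) {
      // Survives re-instantiation because the field belongs to the class.
      if (shared == null) {
        shared = new FakeConnection();
      }
    }
  }
}
```

With `InstanceScopedEngine`, calling `initialize` three times on one instance opens one connection, while creating three instances opens three. With `SharedConnectionEngine`, three instances open one. A static holder has its own costs (the connection is never closed in this sketch, and it is shared per class loader), so it is only an illustration of the lifetime difference.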
>>>>> -----Original Message-----
>>>>> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
>>>>> Sent: February 1, 2016 10:00
>>>>> To: dev@uima.apache.org
>>>>> Subject: Re: Ruta: reloadScript and external engines
>>>>>
>>>>> Hi,
>>>>>
>>>>> yes, this is correct and intended, but it misses the original idea
>>>>> and thus is probably not necessary.
>>>>>
>>>>> Reloading the script needs to be supported for some special use
>>>>> cases, like changing the rules during pipeline processing. The
>>>>> additional analysis engines are, however, specified directly in the
>>>>> configuration parameters and should only be changed with reconfigure().
>>>>>
>>>>> Does this cause problems? I would not change it right now because I
>>>>> am also thinking about removing these parameters altogether in a
>>>>> future major release, since they are redundant and could be inferred
>>>>> from the script files. The objects could be cached, but the
>>>>> initialize is normally the expensive part.
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>> On 30.01.2016 at 23:33, Miguel Alvarez wrote:
>>>>>> Hi Peter,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I have another question about external engines :-) When the
>>>>>> reloadScript parameter is set to false, only one instance of the
>>>>>> external engine is created and the initialize method is invoked
>>>>>> only once before processing all the CASes. This is what I was expecting.
>>>>>> But when reloadScript is set to true, the initialize method of
>>>>>> the external engines is invoked once per CAS, as the documentation
>>>>>> indicates, but it looks like a new instance of the external engine
>>>>>> is created for each CAS too. Is this the expected behaviour?
>>>>>> I was expecting RUTA to create just one instance of the
>>>>>> engine, and then invoke the initialize method on that instance
>>>>>> once per CAS, but I couldn't find any information about this in the
>>>>>> documentation.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Miguel
>>>>>>
>>>>>>

