uima-dev mailing list archives

From "Miguel Alvarez" <miguelal...@gmail.com>
Subject RE: Ruta: reloadScript and external engines
Date Tue, 09 Feb 2016 18:45:19 GMT
Done

-----Original Message-----
From: Peter Klügl [mailto:peter.kluegl@averbis.com] 
Sent: February 8, 2016 2:50
To: dev@uima.apache.org
Subject: Re: Ruta: reloadScript and external engines

Hi,

if there were files attached to this mail, they have been removed. You could attach them to
the jira issue.

Best,

Peter

Am 05.02.2016 um 17:00 schrieb Miguel Alvarez:
> Find attached the source files...
>
> I assume the extensions don't support all the types of parameters (unless you tell me
> otherwise :-) ), so I had to convert the resource parameters to simple string expressions,
> and the feature assignments to string/integer parameter pairs...
>
> -----Original Message-----
> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
> Sent: February 5, 2016 7:23
> To: dev@uima.apache.org
> Subject: Re: Ruta: reloadScript and external engines
>
> hi,
>
> ah ok. I must admit that I was not sure how this issue can be implemented since the connection
> to the file is lost. The wordlist resources are in need of refactoring. I am curious how you
> solved it.
>
> Best,
>
> Peter
>
> Am 05.02.2016 um 16:03 schrieb Miguel Alvarez:
>> I created two extensions that are replicas of markfast and
>> marktable but keep the last modified date of the txt or csv files and
>> reload them if it changes. They work well and this way I don't need
>> to have reloadScript set to true.
>>
>> I created a Jira enhancement for this.
>>
>> Cheers
>> Miguel
>> On Feb 2, 2016 06:11, "Miguel Alvarez" <miguelal007@gmail.com> wrote:
>>
>>> Hi Peter
>>>
>>> Thanks again for your reply. I am trying to get more familiar with 
>>> the RUTA code so I can make these changes myself, but I am not there 
>>> yet, there are a lot of things I still don't understand :) I will 
>>> create a Jira issue for now.
>>>
>>> I actually already tried yesterday the workaround you suggested but
>>> it doesn't seem to be working either. I have refactored the code so
>>> the dictionary logic is in separate scripts and set the reloadScript
>>> parameter to true only for those scripts. The rest of the scripts
>>> have it set to false, including the main script (the one that starts
>>> the chain of scripts). But it seems that unless I set all the scripts
>>> to true the dictionaries don't get reloaded.
>>>
>>> But now that I am thinking about it I should be able to create 
>>> extensions that allow us to reload the dictionaries only, and that 
>>> way we can leave that parameter to false for all the engines.
>>>
>>> I will let you know if that works.
>>>
>>> Cheers
>>> Miguel
>>> On Feb 1, 2016 23:54, "Peter Klügl" <peter.kluegl@averbis.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> the analysis engines do not necessarily need to be created anew. If 
>>>> you want, you can create a jira issue for it and I will fix it.
>>>>
>>>> There is currently no option to reload the dictionaries separately.
>>>> As a workaround, you could extract/refactor all dictionary logic to
>>>> a separate ruta analysis engine/script and then only set the
>>>> parameter to true for this one.
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> Am 01.02.2016 um 20:04 schrieb Miguel Alvarez:
>>>>> Hi Peter,
>>>>>
>>>>> Thanks for your reply.
>>>>>
>>>>> It isn't necessarily causing any problems, I just wanted to
>>>>> understand how it was meant to work. But I can explain to you a bit
>>>>> better my situation, and maybe you have a better suggestion.
>>>>>
>>>>> We are currently setting the parameter reloadScript to true in our
>>>>> RUTA engines so the dictionaries reload without us having to
>>>>> restart the service. But we have some external engines, invoked
>>>>> from RUTA scripts, which create connections to other servers, and
>>>>> until now we have been storing these connections as class instance
>>>>> variables in our external engines so they can be reused and the
>>>>> engine doesn't need to create a new connection for every document
>>>>> processed. And the initialize method checks whether the engine
>>>>> instance already has an open connection, so no matter how many
>>>>> times the initialize method is invoked only one connection is
>>>>> established.
>>>>>
>>>>> But if we invoke this external engine from a RUTA script that has
>>>>> the reloadScript parameter set to true, a new instance of the
>>>>> engine is created for every document processed, and therefore a new
>>>>> connection to the remote server will be established for each
>>>>> document too, regardless of my check for an existing connection in
>>>>> the initialize method (obviously because it is a brand new instance
>>>>> every time).
>>>>>
>>>>> I guess one question I have is: Can we force the reload of 
>>>>> dictionaries directly from the RUTA script or in any other way?
>>>>>
>>>>> Thanks,
>>>>> Miguel
>>>>>
>>>>> -----Original Message-----
>>>>> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
>>>>> Sent: February 1, 2016 10:00
>>>>> To: dev@uima.apache.org
>>>>> Subject: Re: Ruta: reloadScript and external engines
>>>>>
>>>>> Hi,
>>>>>
>>>>> yes, this is correct and intended, but it is missing the original
>>>>> idea and thus it is probably not necessary.
>>>>>
>>>>> The reload of the script needs to be supported for some special use
>>>>> cases like changing the rules during the pipeline processing. The
>>>>> additional analysis engines are however directly specified in the
>>>>> configuration parameters and should only be changed with reconfigure().
>>>>>
>>>>> Does this cause problems? I would not change it right now because I
>>>>> am also thinking about removing these parameters altogether in a next
>>>>> major release since they are redundant and could be induced using
>>>>> the script files. The objects could be cached, but the initialize is
>>>>> normally the expensive part.
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>> Am 30.01.2016 um 23:33 schrieb Miguel Alvarez:
>>>>>> Hi Peter,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I have another question about external engines :-) When the
>>>>>> reloadScript parameter is set to false only one instance of the
>>>>>> external engine is created and the initialize method is invoked
>>>>>> only once before processing all the CASes. This is what I was
>>>>>> expecting. But when reloadScript is set to true the initialize
>>>>>> method of the external engines is invoked once per CAS, as the
>>>>>> documentation indicates, but it looks like a new instance of the
>>>>>> external engine is created for each CAS too. Is this the expected
>>>>>> behaviour? I was expecting RUTA to create just one instance of the
>>>>>> engine, and then on that instance invoke the initialize method
>>>>>> once per CAS, but I couldn't find any information about this in
>>>>>> the documentation.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks again.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Miguel
>>>>>>
>>>>>>
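
The thread describes extensions that remember the last modified date of a txt or csv word list and reload the file when that date changes. The attached sources were removed from the archive, so what follows is only a minimal sketch of that idea in plain Java and not the actual Ruta extension code. The class name ReloadingWordList and its methods are hypothetical, and no UIMA or Ruta API is used.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: a word list that re-reads its file whenever
// the file's last modified time differs from the one seen at load time.
final class ReloadingWordList {
    private final Path file;
    private FileTime lastLoaded;          // modification time at last load
    private Set<String> words = new HashSet<>();
    private int loadCount = 0;            // how often the file was read

    ReloadingWordList(Path file) {
        this.file = file;
    }

    // Returns true if the token is in the (possibly refreshed) word list.
    synchronized boolean contains(String token) throws IOException {
        refreshIfChanged();
        return words.contains(token);
    }

    synchronized int getLoadCount() {
        return loadCount;
    }

    // Reads the file on first use and again only after its timestamp changed.
    private void refreshIfChanged() throws IOException {
        FileTime current = Files.getLastModifiedTime(file);
        if (lastLoaded == null || !current.equals(lastLoaded)) {
            Set<String> fresh = new HashSet<>();
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    fresh.add(trimmed);
                }
            }
            words = fresh;
            lastLoaded = current;
            loadCount++;
        }
    }
}
```

With a check like this the cost per lookup is one file metadata read, and the surrounding engine can keep reloadScript set to false.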

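The connection problem discussed in the thread can be illustrated with a small stand-alone sketch. It assumes an engine that guards its connection with an instance field, as described above. FakeConnection and ExternalEngineSketch are hypothetical stand-ins and do not use the UIMA API. The point is only that an instance-level check stops helping once a new engine instance is created for every document.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a connection to a remote server.
// The static counter records how many connections were opened.
final class FakeConnection {
    static final AtomicInteger OPENED = new AtomicInteger();

    FakeConnection() {
        OPENED.incrementAndGet();
    }
}

// Hypothetical stand-in for an external engine invoked from a script.
final class ExternalEngineSketch {
    private FakeConnection connection;   // instance field, lost with the instance

    // Opens a connection only if this instance does not have one yet.
    void initialize() {
        if (connection == null) {
            connection = new FakeConnection();
        }
    }
}
```

Calling initialize() repeatedly on one ExternalEngineSketch opens a single connection, while creating a fresh instance per document and initializing it opens one connection per document, which matches the behaviour reported with reloadScript set to true.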

