uima-dev mailing list archives

From "Miguel Alvarez" <miguelal...@gmail.com>
Subject RE: Ruta: reloadScript and external engines
Date Tue, 16 Feb 2016 07:00:37 GMT
Definitely! Please feel free to use that code.

By the way, when you guys do the voting for a new RC, how does that work? Can anybody vote?

-----Original Message-----
From: Peter Klügl [mailto:peter.kluegl@averbis.com] 
Sent: February 15, 2016 2:38
To: dev@uima.apache.org
Subject: Re: Ruta: reloadScript and external engines

Hi,

sorry, I did not find the time yet to take a look at it.

May the files be interpreted as contributions to Ruta?

Best,

Peter

Am 09.02.2016 um 19:45 schrieb Miguel Alvarez:
> Done
>
> -----Original Message-----
> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
> Sent: February 8, 2016 2:50
> To: dev@uima.apache.org
> Subject: Re: Ruta: reloadScript and external engines
>
> Hi,
>
> if there were files attached to this mail, they have been removed.
> You could attach them to the Jira issue.
>
> Best,
>
> Peter
>
> Am 05.02.2016 um 17:00 schrieb Miguel Alvarez:
>> Find attached the source files...
>>
>> I assume the extensions don't support all the types of parameters
>> (unless you tell me otherwise :-) ), so I had to convert the resource
>> parameters to simple string expressions, and the feature assignments
>> to string/integer parameter pairs...
>>
>> -----Original Message-----
>> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
>> Sent: February 5, 2016 7:23
>> To: dev@uima.apache.org
>> Subject: Re: Ruta: reloadScript and external engines
>>
>> hi,
>>
>> ah ok. I must admit that I was not sure how this issue could be
>> implemented since the connection to the file is lost. The wordlist
>> resources are in need of refactoring. I am curious how you solved it.
>>
>> Best,
>>
>> Peter
>>
>> Am 05.02.2016 um 16:03 schrieb Miguel Alvarez:
>>> I created two extensions that are the replica of markfast and 
>>> marktable but keep the last modified date of the txt or csv files 
>>> and reload them if it changes. They work well and this way I don't 
>>> need to have the reloadScript set to true.
>>>
>>> I created a Jira enhancement for this.
>>>
>>> Cheers
>>> Miguel
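
The reload-on-change idea described in this message (remember the last modified date of the txt or csv file and re-read it when the date changes) could look roughly like the sketch below. This is only an illustration in plain Java, not the code attached to the Jira issue and not the Ruta extension API; the class name `ReloadingWordList` and its methods are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: caches a word list and re-reads it when the file's timestamp changes. */
class ReloadingWordList {
  private final Path file;
  private FileTime lastLoaded;                 // null until the first load
  private Set<String> words = new HashSet<>();
  private int loadCount = 0;                   // kept only to make reloads observable

  ReloadingWordList(Path file) {
    this.file = file;
  }

  /** Returns the cached words, reloading first if the file was modified since the last load. */
  synchronized Set<String> getWords() throws IOException {
    FileTime current = Files.getLastModifiedTime(file);
    if (lastLoaded == null || !current.equals(lastLoaded)) {
      words = new HashSet<>(Files.readAllLines(file, StandardCharsets.UTF_8));
      lastLoaded = current;
      loadCount++;
    }
    return words;
  }

  synchronized boolean contains(String token) throws IOException {
    return getWords().contains(token);
  }

  synchronized int getLoadCount() {
    return loadCount;
  }
}
```

With a helper like this, the timestamp check runs on every lookup, so the file is only read again after it has actually changed and reloadScript can stay set to false.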
>>> On Feb 2, 2016 06:11, "Miguel Alvarez" <miguelal007@gmail.com> wrote:
>>>
>>>> Hi Peter
>>>>
>>>> Thanks again for your reply. I am trying to get more familiar with 
>>>> the RUTA code so I can make these changes myself, but I am not 
>>>> there yet, there are a lot of things I still don't understand :) I 
>>>> will create a Jira issue for now.
>>>>
>>>> I actually already tried yesterday the workaround you suggested but 
>>>> it doesn't seem to be working either. I have refactored the code so 
>>>> the dictionary logic is in separate scripts and set their parameter 
>>>> to true, only for those scripts. But the rest of the scripts have 
>>>> it set to false including the main script (the one that starts the 
>>>> chain of scripts), but it seems that unless I set all the scripts 
>>>> to true the dictionaries don't get reloaded.
>>>>
>>>> But now that I am thinking about it I should be able to create 
>>>> extensions that allow us to reload the dictionaries only, and that 
>>>> way we can leave that parameter to false for all the engines.
>>>>
>>>> I will let you know if that works.
>>>>
>>>> Cheers
>>>> Miguel
>>>> On Feb 1, 2016 23:54, "Peter Klügl" <peter.kluegl@averbis.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> the analysis engines do not necessarily need to be created anew. 
>>>>> If you want, you can create a jira issue for it and I will fix it.
>>>>>
>>>>> There is currently no option to reload the dictionaries separately. 
>>>>> As a workaround, you could extract/refactor all dictionary logic 
>>>>> to a separate ruta analysis engine/script and then only set the 
>>>>> reloadScript parameter to true for this one.
>>>>>
>>>>> Best,
>>>>>
>>>>> Peter
>>>>>
>>>>> Am 01.02.2016 um 20:04 schrieb Miguel Alvarez:
>>>>>> Hi Peter,
>>>>>>
>>>>>> Thanks for your reply.
>>>>>>
>>>>>> It isn't necessarily causing any problems, I just wanted to
>>>>>> understand how it was meant to work. But I can explain my
>>>>>> situation a bit better, and maybe you have a better suggestion.
>>>>>>
>>>>>> We are currently setting the parameter reloadScript to true in
>>>>>> our RUTA engines so the dictionaries reload without us having to
>>>>>> restart the service. But we have some external engines, invoked
>>>>>> from RUTA scripts, which create connections to other servers, and
>>>>>> until now we have been storing these connections as class
>>>>>> instance variables in our external engines so they can be reused
>>>>>> and the engine doesn't need to create a new connection for every
>>>>>> document processed. And the initialize method checks whether the
>>>>>> engine instance already has an open connection, so no matter how
>>>>>> many times the initialize method is invoked only one connection
>>>>>> is established.
>>>>>>
>>>>>> But if we invoke this external engine from a RUTA script that has
>>>>>> the reloadScript parameter set to true, a new instance of the
>>>>>> engine is created for every document processed, and therefore a
>>>>>> new connection to the remote server will be established for each
>>>>>> document too, regardless of my check for an existing connection
>>>>>> in the initialize method (obviously because it is a brand new
>>>>>> instance every time).
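
The behaviour described above can be reproduced with a small stand-alone sketch. It uses no UIMA or Ruta classes; `ConnectingEngine` and `ReloadDemo` are hypothetical names, and a plain `Object` stands in for the remote connection. It only illustrates why a guard on an instance variable cannot help once a fresh instance is created per document.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an external engine; no UIMA classes are used. */
class ConnectingEngine {
  // Counts simulated connections across all instances, for demonstration only.
  static final AtomicInteger OPENED = new AtomicInteger();

  private Object connection; // instance variable, lost together with the instance

  void initialize() {
    if (connection == null) {      // the guard only sees this instance's state
      connection = new Object();   // stands in for opening a remote connection
      OPENED.incrementAndGet();
    }
  }
}

class ReloadDemo {
  /** One engine instance is reused; initialize is invoked once per document. */
  static int processWithSharedInstance(int documents) {
    ConnectingEngine.OPENED.set(0);
    ConnectingEngine engine = new ConnectingEngine();
    for (int i = 0; i < documents; i++) {
      engine.initialize();
    }
    return ConnectingEngine.OPENED.get();
  }

  /** A fresh instance is created per document, as observed with reloadScript set to true. */
  static int processWithFreshInstances(int documents) {
    ConnectingEngine.OPENED.set(0);
    for (int i = 0; i < documents; i++) {
      new ConnectingEngine().initialize();
    }
    return ConnectingEngine.OPENED.get();
  }
}
```

In the first method the guard keeps the count at one connection no matter how many documents are processed; in the second, every document opens its own connection.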
>>>>>>
>>>>>> I guess one question I have is: Can we force the reload of 
>>>>>> dictionaries directly from the RUTA script or in any other way?
>>>>>>
>>>>>> Thanks,
>>>>>> Miguel
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
>>>>>> Sent: February 1, 2016 10:00
>>>>>> To: dev@uima.apache.org
>>>>>> Subject: Re: Ruta: reloadScript and external engines
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> yes, this is correct and intended, but it is missing the original
>>>>>> idea and thus it is probably not necessary.
>>>>>>
>>>>>> The reload of the script needs to be supported for some special
>>>>>> use cases like changing the rules during the pipeline processing.
>>>>>> The additional analysis engines are however directly specified in
>>>>>> the configuration parameters and should only be changed with
>>>>>> reconfigure().
>>>>>>
>>>>>> Does this cause problems? I would not change it right now because
>>>>>> I am also thinking about removing these parameters altogether in a
>>>>>> next major release since they are redundant and could be derived
>>>>>> from the script files. The objects could be cached, but the
>>>>>> initialize is normally the expensive part.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>> Am 30.01.2016 um 23:33 schrieb Miguel Alvarez:
>>>>>>> Hi Peter,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I have another question about external engines :-) When the
>>>>>>> reloadScript parameter is set to false only one instance of the
>>>>>>> external engine is created and the initialize method is invoked
>>>>>>> only once before processing all the CASes. This is what I was
>>>>>>> expecting. But when the reloadScript is set to true the
>>>>>>> initialize method of the external engines is invoked once per
>>>>>>> CAS, as the documentation indicates, but it looks like a new
>>>>>>> instance of the external engine is created for each CAS too. Is
>>>>>>> this the expected behaviour?
>>>>>>>
>>>>>>> I was expecting RUTA to create just one instance of the engine,
>>>>>>> and then on that instance invoke the initialize method once per
>>>>>>> CAS, but I couldn't find any information about this in the
>>>>>>> documentation.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks again.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Miguel
>>>>>>>
>>>>>>>