uima-dev mailing list archives

From Peter Klügl <peter.klu...@averbis.com>
Subject Re: Ruta: reloadScript and external engines
Date Fri, 05 Feb 2016 15:23:16 GMT
hi,

Ah, OK. I must admit that I was not sure how this issue can be
implemented since the connection to the file is lost. The wordlist
resources are in need of refactoring. I am curious how you solved it.

Best,

Peter

On 05.02.2016 at 16:03, Miguel Alvarez wrote:
> I created two extensions that replicate MARKFAST and MARKTABLE but keep
> track of the last-modified date of the txt or csv files and reload them
> when it changes. They work well, and this way I don't need to have
> reloadScript set to true.
>
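The extensions themselves are not shown in this thread. The idea described above (remember the file's last-modified time and re-read the file only when it changes) could be sketched roughly as follows. `ReloadableWordList` and its methods are hypothetical names for illustration, not part of Ruta, and the sketch assumes a plain one-entry-per-line txt word list.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not Ruta API): caches a word list and re-reads it when the file changes. */
final class ReloadableWordList {
    private final Path file;
    private FileTime loadedAt;                       // last-modified time seen at the last load
    private Set<String> words = Collections.emptySet();
    private int loadCount = 0;                       // how often the file was actually read

    ReloadableWordList(Path file) {
        this.file = file;
    }

    /** Returns the current entries, reloading only if the file's last-modified time differs. */
    synchronized Set<String> get() throws IOException {
        FileTime current = Files.getLastModifiedTime(file);
        if (loadedAt == null || !current.equals(loadedAt)) {
            Set<String> fresh = new HashSet<>();
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String entry = line.trim();
                if (!entry.isEmpty()) {
                    fresh.add(entry);
                }
            }
            words = Collections.unmodifiableSet(fresh);
            loadedAt = current;
            loadCount++;
        }
        return words;
    }

    synchronized int loadCount() {
        return loadCount;
    }
}
```

An extension built this way would call `get()` before each lookup, so an unchanged file costs only a metadata check per call. How such a helper is wired into a Ruta action extension is left open here.
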
> I created a Jira enhancement for this.
>
> Cheers
> Miguel
> On Feb 2, 2016 06:11, "Miguel Alvarez" <miguelal007@gmail.com> wrote:
>
>> Hi Peter
>>
>> Thanks again for your reply. I am trying to get more familiar with the
>> RUTA code so I can make these changes myself, but I am not there yet;
>> there are a lot of things I still don't understand :) I will create a
>> Jira issue for now.
>>
>> I actually already tried the workaround you suggested yesterday, but it
>> doesn't seem to be working either. I have refactored the code so the
>> dictionary logic is in separate scripts and set the reloadScript
>> parameter to true only for those scripts. The rest of the scripts have it
>> set to false, including the main script (the one that starts the chain of
>> scripts), but it seems that unless I set all the scripts to true the
>> dictionaries don't get reloaded.
>>
>> But now that I am thinking about it, I should be able to create extensions
>> that allow us to reload the dictionaries only, and that way we can leave
>> that parameter set to false for all the engines.
>>
>> I will let you know if that works.
>>
>> Cheers
>> Miguel
>> On Feb 1, 2016 23:54, "Peter Klügl" <peter.kluegl@averbis.com> wrote:
>>
>>> Hi,
>>>
>>> the analysis engines do not necessarily need to be created anew. If you
>>> want, you can create a jira issue for it and I will fix it.
>>>
>>> There is currently no option to reload the dictionaries separately. As a
>>> workaround, you could extract/refactor all dictionary logic to a
>>> separate ruta analysis engine/script and then only set the parameter to
>>> true for this one.
>>>
>>> Best,
>>>
>>> Peter
>>>
>>> On 01.02.2016 at 20:04, Miguel Alvarez wrote:
>>>> Hi Peter,
>>>>
>>>> Thanks for your reply.
>>>>
>>>> It isn't necessarily causing any problems; I just wanted to understand
>>>> how it was meant to work. But I can explain my situation to you a bit
>>>> better, and maybe you have a better suggestion.
>>>>
>>>> We are currently setting the parameter reloadScript to true in our RUTA
>>>> engines so the dictionaries reload without us having to restart the
>>>> service. But we have some external engines, invoked from RUTA scripts,
>>>> which create connections to other servers. Until now we have been storing
>>>> these connections as instance variables in our external engines so they
>>>> can be reused and the engine doesn't need to create a new connection for
>>>> every document processed. The initialize method checks whether the engine
>>>> instance already has an open connection, so no matter how many times the
>>>> initialize method is invoked, only one connection is established.
>>>>
>>>> But if we invoke this external engine from a RUTA script that has the
>>>> reloadScript parameter set to true, a new instance of the engine is
>>>> created for every document processed, and therefore a new connection to
>>>> the remote server will be established for each document too, regardless
>>>> of my check for an existing connection in the initialize method
>>>> (obviously because it is a brand new instance every time).
>>>>
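To illustrate why the instance-level check described above cannot help once a new engine instance is created per document, here is a minimal sketch in plain Java. It is not the UIMA API: `RemoteConnection`, `SharedConnection` and `ExternalEngine` are hypothetical stand-ins. It also shows one possible way around the problem that nobody in this thread proposes, namely holding the connection at class level so it outlives individual engine instances. That assumes all instances are loaded by the same class loader, and it leaves open when the connection gets closed.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for an expensive remote connection; counts how many were opened. */
final class RemoteConnection {
    static final AtomicInteger opened = new AtomicInteger();

    RemoteConnection() {
        opened.incrementAndGet();
    }
}

/** Hypothetical holder: one lazily created connection shared by all engine instances. */
final class SharedConnection {
    private static RemoteConnection instance;

    private SharedConnection() {
    }

    static synchronized RemoteConnection get() {
        if (instance == null) {
            instance = new RemoteConnection();
        }
        return instance;
    }
}

/** Simplified stand-in for an external engine (assumption: not a real UIMA component). */
final class ExternalEngine {
    private RemoteConnection connection;

    void initialize() {
        // Checking only this instance field would not prevent reconnects,
        // because a brand new instance always starts with connection == null.
        // Asking the class-level holder returns the already opened connection.
        connection = SharedConnection.get();
    }

    RemoteConnection connection() {
        return connection;
    }
}
```

With this arrangement, creating and initializing a fresh `ExternalEngine` for every document still opens only one `RemoteConnection`.
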
>>>> I guess one question I have is: Can we force the reload of dictionaries
>>>> directly from the RUTA script or in any other way?
>>>>
>>>> Thanks,
>>>> Miguel
>>>>
>>>> -----Original Message-----
>>>> From: Peter Klügl [mailto:peter.kluegl@averbis.com]
>>>> Sent: February 1, 2016 10:00
>>>> To: dev@uima.apache.org
>>>> Subject: Re: Ruta: reloadScript and external engines
>>>>
>>>> Hi,
>>>>
>>>> yes, this is correct and intended, but it is missing the original idea
>>>> and thus it is probably not necessary.
>>>>
>>>> The reload of the script needs to be supported for some special use
>>>> cases like changing the rules during the pipeline processing. The
>>>> additional analysis engines are, however, directly specified in the
>>>> configuration parameters and should only be changed with reconfigure().
>>>>
>>>> Does this cause problems? I would not change it right now because I am
>>>> also thinking about removing these parameters altogether in a next major
>>>> release since they are redundant and could be derived from the script
>>>> files. The objects could be cached, but the initialize is normally the
>>>> expensive part.
>>>>
>>>> Best,
>>>>
>>>> Peter
>>>>
>>>> On 30.01.2016 at 23:33, Miguel Alvarez wrote:
>>>>> Hi Peter,
>>>>>
>>>>>
>>>>>
>>>>> I have another question about external engines :-) When the
>>>>> reloadScript parameter is set to false, only one instance of the
>>>>> external engine is created and the initialize method is invoked only
>>>>> once before processing all the CASes. This is what I was expecting.
>>>>> But when reloadScript is set to true, the initialize method of the
>>>>> external engines is invoked once per CAS, as the documentation
>>>>> indicates, but it looks like a new instance of the external engine is
>>>>> created for each CAS too. Is this the expected behaviour?
>>>>> I was expecting RUTA to create just one instance of the engine,
>>>>> and then invoke the initialize method on that instance once per CAS,
>>>>> but I couldn't find any information about this in the documentation.
>>>>>
>>>>>
>>>>>
>>>>> Thanks again.
>>>>>
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Miguel
>>>>>
>>>>>
>>>

