Andy,
thanks. I dug through the code paths in both the latest version and the
version we still have deployed (arq 2.8.5). The only difference seems to
be that in 2.8.5, the readImpl of JenaReadRIOT resolves the base here:
    private void readImpl(Model model, Tokenizer tokenizer, String base)
    {
        // The reader has been checked, if possible, by now or
        // constructed correctly by code here.
        if ( base != null )
            base = IRIResolver.resolveGlobalToString(base) ;
        try {
            model.notifyEvent( GraphEvents.startRead );
            readWorker(model, tokenizer, base) ;
        }
        ...
whereas in the latest, there is no resolution of the base at all at that
stage:
    private void readImpl(Model model, Tokenizer tokenizer, String base)
    {
        try {
            model.notifyEvent( GraphEvents.startRead );
            readWorker(model, tokenizer, base) ;
        }
        ...
Now, deeper down, both create an IRIResolverNormal object, so I am
guessing it is safe if I patch our current version to remove the call to
resolveGlobalToString. Any comments? (and before you ask, this is because
we are currently not yet upgrading the jena stack, but I need to fix this
since it leads to concurrency errors during our indexing process)
thanks
Simon
From: Andy Seaborne <andy@apache.org>
To: Simon Helsen/Toronto/IBM@IBMCA
Cc: jena-dev@incubator.apache.org, Andy Seaborne <andy.seaborne.apache@gmail.com>
Date: 11/03/2011 04:15 AM
Subject: Re: IRIResolver cache not thread safe?
On 02/11/11 20:06, Simon Helsen wrote:
> However, independent of that, when I look at IRIResolverNormal, it has a
> cache
>
> private Cache<String, IRI> resolvedIRIs =
> CacheFactory.createCache(getter, CacheSize) ;
>
> which is not thread-safe. So public IRI resolveSilent(String
> relURI) could not be used from multiple threads.
>
> @Andy: are you saying that access to resolveSilent is definitely
> single-threaded, even if there are multiple concurrent reads?
resolveSilent is an instance method; it is not called from separate
threads without an outer lock.
Each parser run creates a Prologue object, which includes a prefix map,
a resolver, and also the bNode label mapping.
It lives only for the lifetime of the RIOT parser (not the Jena reader),
and a new one is created for each parse run, so these objects are not
shared between threads; they hold a handle on the input stream being
parsed, for example. There is no need to lock.
There is also a static instance of IRIResolverNormal inside IRIResolver
(there is no route to getting hold of the object directly) and every
static method that accesses it is synchronized. It is created in class
initialization.
e.g.
    static private IRI resolveIRI(String relStr, String baseStr)
    {
        synchronized(globalResolverLock)
        {
            IRI i = iriFactory.create(relStr);
            if (i.isAbsolute())
                // removes excess . segments
                return globalResolver.getBaseIRI().create(i);
            IRI base = iriFactory.create(baseStr);
            if ("file".equalsIgnoreCase(base.getScheme()))
                return globalResolver.getBaseIRI().create(i);
            return base.create(i);
        }
    }
IRIResolverNormal is a static inner class with resolvedIRIs as a field.
A new cache is created for each new instance of IRIResolverNormal, and
all calls into IRIResolverNormal are either via IRIResolver.create or
globalResolver in IRIResolver.
So either IRIResolverNormal is used by a parser (new instance) or it's
used by a synchronized static. There is no need to use a concurrent hash
map for resolvedIRIs.
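To make the two access paths concrete, here is a minimal sketch of the pattern. It is not the Jena code: ResolverSketch, CachingResolver, resolveGlobal and parseRun are hypothetical names, and java.net.URI stands in for the IRI classes. It only illustrates why a plain, unsynchronized cache is acceptable when every instance is either confined to one parse run or reached solely through a lock.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class ResolverSketch {

    /** Hypothetical stand-in for IRIResolverNormal: one unsynchronized cache per instance. */
    static final class CachingResolver {
        private final URI base;
        // Plain HashMap: only safe while one thread (or one lock holder) uses this instance.
        private final Map<String, URI> resolved = new HashMap<>();

        CachingResolver(String base) {
            this.base = URI.create(base);
        }

        URI resolve(String rel) {
            return resolved.computeIfAbsent(rel, r -> base.resolve(r));
        }
    }

    // Path 1: a single shared instance, reachable only through synchronized static methods.
    private static final Object GLOBAL_LOCK = new Object();
    private static final CachingResolver GLOBAL =
            new CachingResolver("http://example.org/base/");

    static URI resolveGlobal(String rel) {
        synchronized (GLOBAL_LOCK) {
            return GLOBAL.resolve(rel);
        }
    }

    // Path 2: each parse run builds its own resolver, so the cache never crosses threads.
    static URI parseRun(String base, String rel) {
        CachingResolver perRun = new CachingResolver(base);
        return perRun.resolve(rel);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    resolveGlobal("doc" + n);                                  // guarded by the lock
                    parseRun("http://example.org/run" + id + "/", "doc" + n); // thread-confined
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(resolveGlobal("doc0"));
    }
}
```

The design only holds as long as no code path hands the shared instance out or calls it outside the lock; a caller that did either would need a concurrent map or its own synchronization.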
If this is not the case, could you please point to the code where it is
not so?
Andy
>
> thanks
>
> Simon