nutch-dev mailing list archives

From "Leon Misakyan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (NUTCH-2253) ProtocolFactory still not thread-safe
Date Wed, 20 Apr 2016 15:55:25 GMT

     [ https://issues.apache.org/jira/browse/NUTCH-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leon Misakyan updated NUTCH-2253:
---------------------------------
    Description: 
Hi, as far as I can see, in the 1.11 release the ProtocolFactory class still has a
thread-safety issue in its getProtocol method.
Every fetcher thread has its own ProtocolFactory instance (this.protocolFactory
= new ProtocolFactory(conf); in the FetcherThread constructor).
So making this method synchronized is useless: each thread locks the monitor of its own
factory instance, and the threads never exclude one another.
In our project this results in multiple Protocol instances being created.
The issue can be fixed either by having getProtocol use the shared conf instance as the lock
object, or by having one ProtocolFactory shared by all fetcher threads.
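
To illustrate the first suggestion, here is a minimal, self-contained sketch. It does not use the Nutch classes: Conf, Factory and the cache map are hypothetical stand-ins for the shared configuration object, the per-thread ProtocolFactory and the protocol cache. It only assumes, as described above, that every thread builds its own factory from the same conf instance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedLockSketch {

    /** Hypothetical stand-in for the configuration object shared by all threads. */
    static final class Conf {
        // Cache visible to every factory built from this Conf (assumption).
        final Map<String, Object> cache = new HashMap<>();
    }

    /** Counts how many "protocol" objects were actually constructed. */
    static final AtomicInteger created = new AtomicInteger();

    /** Hypothetical stand-in for a factory that each thread instantiates itself. */
    static final class Factory {
        private final Conf conf;

        Factory(Conf conf) {
            this.conf = conf;
        }

        Object getProtocol(String scheme) {
            // Lock on the shared conf rather than on `this`: a synchronized
            // method would lock a different monitor in every thread.
            synchronized (conf) {
                Object protocol = conf.cache.get(scheme);
                if (protocol == null) {
                    created.incrementAndGet();
                    protocol = new Object(); // placeholder for a real protocol instance
                    conf.cache.put(scheme, protocol);
                }
                return protocol;
            }
        }
    }

    /** Starts `threads` threads, each with its own Factory, and returns the creation count. */
    static int run(int threads) {
        created.set(0);
        Conf conf = new Conf();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> new Factory(conf).getProtocol("http"));
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + run(8)); // prints "instances created: 1"
    }
}
```

Because the check and the insertion happen under one lock that all threads share, only the first thread creates the instance. The second suggestion (a single factory passed to all fetcher threads) would make a plain synchronized method sufficient, since all threads would then lock the same factory monitor.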


  was:
The method getProtocol() should be synchronized; otherwise the Fetcher threads can access it
at around the same time and query the cache before it has had a chance to be populated properly.
This would happen only for a handful of calls, until subsequent ones hit the cache, but it
should be fixed nonetheless, e.g. when we want a guarantee that the same Protocol instance
is used throughout a fetching session.
The other Factory classes that use the same cache mechanism would suffer from the same problem.
   


> ProtocolFactory still not thread-safe
> -------------------------------------
>
>                 Key: NUTCH-2253
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2253
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.10, 1.11
>            Reporter: Leon Misakyan
>             Fix For: 2.3, 1.8
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
