nutch-dev mailing list archives

From "Leon Misakyan (JIRA)" <>
Subject [jira] [Commented] (NUTCH-1604) ProtocolFactory not thread-safe
Date Tue, 19 Apr 2016 17:09:25 GMT


Leon Misakyan commented on NUTCH-1604:

Hi, as far as I can see, in the 1.11 release the ProtocolFactory class still has an issue in its getProtocol method.
This is because every fetcher thread has its own ProtocolFactory instance (this.protocolFactory
= new ProtocolFactory(conf); in the FetcherThread constructor).
So making this method synchronized is useless, because each thread locks on its own monitor.
In our project we have run into this: multiple Protocol instances get created.
The issue can be fixed by having getProtocol use the shared conf instance as the lock object, or by
having one ProtocolFactory for all fetcher threads.
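
To illustrate the first suggestion, here is a minimal sketch, not the actual Nutch code. FakeConf, FakeFactory and FakeProtocol are hypothetical stand-ins for the shared Configuration (with its object cache), ProtocolFactory and a Protocol plugin. Each thread builds its own factory, as FetcherThread does, but the check-then-create step locks on the shared conf rather than on the factory itself, so all threads contend for the same monitor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class SharedLockDemo {

  /** Hypothetical stand-in for a Protocol plugin instance. */
  static class FakeProtocol {}

  /** Hypothetical stand-in for the shared conf and the cache attached to it. */
  static class FakeConf {
    final Map<String, FakeProtocol> cache = new HashMap<>();
  }

  /** Hypothetical factory; one instance is created per thread. */
  static class FakeFactory {
    private final FakeConf conf;

    FakeFactory(FakeConf conf) {
      this.conf = conf;
    }

    FakeProtocol getProtocol(String scheme) {
      // Lock on the shared conf. A synchronized instance method would lock
      // on 'this', which is a different object in every thread.
      synchronized (conf) {
        FakeProtocol p = conf.cache.get(scheme);
        if (p == null) {
          p = new FakeProtocol();
          conf.cache.put(scheme, p);
        }
        return p;
      }
    }
  }

  /** Runs the given number of threads and counts distinct instances returned. */
  static int distinctInstances(int threads) throws InterruptedException {
    FakeConf conf = new FakeConf();
    Set<FakeProtocol> seen = ConcurrentHashMap.newKeySet();
    CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        FakeFactory factory = new FakeFactory(conf); // per-thread factory
        try {
          start.await(); // release all threads at roughly the same time
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        seen.add(factory.getProtocol("http"));
      });
      workers[i].start();
    }
    start.countDown();
    for (Thread t : workers) {
      t.join();
    }
    return seen.size();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(distinctInstances(16)); // prints 1
  }
}
```

The second suggestion, one factory shared by all fetcher threads, would make the existing synchronized method effective without changing the lock object, at the cost of passing that single instance into every thread.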

> ProtocolFactory not thread-safe
> -------------------------------
>                 Key: NUTCH-1604
>                 URL:
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.7, 2.2.1
>            Reporter: Julien Nioche
>             Fix For: 2.3, 1.8
>         Attachments: NUTCH-1604.2.x.patch, NUTCH-1604.patch
> The method getProtocol() should be synchronized, otherwise the Fetcher threads can access
> it around the same time and query the cache before it has had a chance to be populated properly.
> This would happen for a handful of calls until the subsequent ones hit the cache, but this
> should be fixed nonetheless, e.g. when we want a guarantee that the same Protocol instance
> will be used for the same fetching session.
> The other Factory classes which use the same cache mechanism would suffer from the same

This message was sent by Atlassian JIRA
