nutch-dev mailing list archives

From Andrzej Bialecki
Subject Re: no static NutchConf
Date Wed, 04 Jan 2006 19:10:25 GMT
Jérôme Charron wrote:

>>>Excuse me in advance, I probably missed something, but what are the use
>>>cases for having many NutchConf instances with different values?
>>Running many different tasks in parallel, each using different config,
>>inside the same JVM.
>Ok, I understand this Andrzej, but it is not really what I call a use case.
>It is more a feature that you describe here.
>In fact, what I mean is that I don't understand in which cases it will be
>useful. And I don't understand how a particular
>NutchConfig will be selected for a particular task...

Use case: executing multiple tasks on a single tasktracker node, each 
with a drastically different configuration.

Example: what happens now if you try to run more than one fetcher at the 
same time, where the fetcher parameters differ (or the set of activated 
plugins differs)? You can't: the local tasks on each tasktracker will 
use whatever local config is there. What happens if you change the 
config on the node that submits the job? The changes won't be propagated 
to the tasktracker nodes, because tasktrackers read their local 
configuration (through the singleton NutchConf.get()) instead of being 
supplied with a serialized/deserialized instance of the config from the 
originating node... etc.
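
To make the difference concrete, here is a minimal sketch of two tasks 
running in one JVM, each holding its own config instance rather than 
asking a singleton. This is not actual Nutch code: TaskConf, FetcherTask 
and runBoth() are made-up names, and the property keys and values are 
only illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerTaskConfigSketch {

    /** Hypothetical instance-based config; a stand-in, not the real NutchConf API. */
    static final class TaskConf {
        private final Properties props = new Properties();

        TaskConf set(String key, String value) {
            props.setProperty(key, value);
            return this;
        }

        String get(String key, String defaultValue) {
            return props.getProperty(key, defaultValue);
        }
    }

    /** Hypothetical task that reads settings only from the config passed to it. */
    static final class FetcherTask implements Callable<String> {
        private final TaskConf conf;

        FetcherTask(TaskConf conf) {
            this.conf = conf;
        }

        @Override
        public String call() {
            return "threads=" + conf.get("fetcher.threads.fetch", "10")
                    + " plugins=" + conf.get("plugin.includes", "");
        }
    }

    /** Runs two differently configured tasks side by side in the same JVM. */
    static List<String> runBoth() throws Exception {
        TaskConf confA = new TaskConf()
                .set("fetcher.threads.fetch", "5")
                .set("plugin.includes", "protocol-http");
        TaskConf confB = new TaskConf()
                .set("fetcher.threads.fetch", "50")
                .set("plugin.includes", "protocol-ftp");

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> a = pool.submit(new FetcherTask(confA));
            Future<String> b = pool.submit(new FetcherTask(confB));
            List<String> results = new ArrayList<>();
            results.add(a.get());
            results.add(b.get());
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : runBoth()) {
            System.out.println(line);
        }
    }
}
```

With a static singleton, both tasks would see the same values; here each 
task sees only the instance it was constructed with.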

NutchConf instances will be created when you create a JobConf. They 
will then have to be serialized when the jobtracker sends job 
descriptors to tasktrackers on mapred nodes, deserialized there, and 
used locally by the tasktrackers to instantiate local tasks with copies 
of the original NutchConf instance.
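
The round trip could look roughly like the sketch below. It assumes 
plain Java object serialization and uses java.util.Properties as a 
stand-in for the config, purely for illustration; the actual wire format 
used between jobtracker and tasktrackers may well differ, and 
ConfShippingSketch is a made-up name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Properties;

public class ConfShippingSketch {

    /** Submitting side: turn the config into bytes that can travel with the job. */
    static byte[] serialize(Properties conf) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(conf);
        }
        return bytes.toByteArray();
    }

    /** Task side: rebuild an independent copy of the config from the bytes. */
    static Properties deserialize(byte[] wire) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(wire))) {
            return (Properties) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Config as set on the node that submits the job (illustrative key and value).
        Properties original = new Properties();
        original.setProperty("fetcher.threads.fetch", "25");

        // The copy a local task would be instantiated with.
        Properties taskCopy = deserialize(serialize(original));
        System.out.println(taskCopy.getProperty("fetcher.threads.fetch"));
    }
}
```

The task-side copy carries the submitter's values, not whatever config 
happens to be on the local node, and changing the copy does not affect 
the original.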

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
