nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Update of "Incremental Crawling Scripts Test" by Gabriele Kahlout
Date Sun, 27 Mar 2011 13:55:49 GMT

The "Incremental Crawling Scripts Test" page has been changed by Gabriele Kahlout.
http://wiki.apache.org/nutch/Incremental%20Crawling%20Scripts%20Test?action=diff&rev1=2&rev2=3

--------------------------------------------------

+ 1. Abridged script using Solr
+ {{{
+ ./whole-web-crawling-incremental seeds 10 1
+ rm: seeds/it_seeds/urls: No such file or directory
+ Injector: starting at 2011-03-27 15:46:15
+ Injector: crawlDb: crawl/crawldb
+ Injector: urlDir: seeds/it_seeds
+ Injector: Converting injected urls to crawl db entries.
+ Injector: Merging injected urls into crawl db.
+ Injector: finished at 2011-03-27 15:46:31, elapsed: 00:00:15
+ Fetcher: starting at 2011-03-27 15:46:59
+ Fetcher: segment: crawl/segments/20110327154649
+ Fetcher: threads: 10
+ QueueFeeder finished: total 10 records + hit by time limit :0
+ fetching http://simple.wikipedia.org/wiki/%C2%A3sd
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=9
+ fetching http://simple.wikipedia.org/wiki/%2B44
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ fetching http://simple.wikipedia.org/wiki/%28What%27s_the_Story%29_Morning_Glory%3F
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ fetching http://simple.wikipedia.org/wiki/%C3%81lvaro_Mej%C3%ADa_P%C3%A9rez
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ fetching http://simple.wikipedia.org/wiki/%C3%81lvaro_Lopes_Can%C3%A7ado
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ fetching http://simple.wikipedia.org/wiki/%2703_Bonnie_&_Clyde
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 1
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233656322
+   now           = 1301233656859
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+   1. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   2. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   3. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233662094
+   now           = 1301233657867
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+   1. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   2. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   3. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233662094
+   now           = 1301233658939
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+   1. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   2. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   3. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233662094
+   now           = 1301233660020
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+   1. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   2. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   3. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233662094
+   now           = 1301233661025
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+   1. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   2. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   3. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233662094
+   now           = 1301233662032
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+   1. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   2. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   3. http://simple.wikipedia.org/wiki/%27N_Sync
+ fetching http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233667900
+   now           = 1301233663039
+   0. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   1. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   2. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233667900
+   now           = 1301233664285
+   0. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   1. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   2. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233667900
+   now           = 1301233665409
+   0. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   1. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   2. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233667900
+   now           = 1301233666415
+   0. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   1. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   2. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233667900
+   now           = 1301233667516
+   0. http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+   1. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   2. http://simple.wikipedia.org/wiki/%27N_Sync
+ fetching http://simple.wikipedia.org/wiki/%27s-Hertogenbosch
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233673363
+   now           = 1301233668525
+   0. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   1. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233673363
+   now           = 1301233669647
+   0. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   1. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233673363
+   now           = 1301233670783
+   0. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   1. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233673363
+   now           = 1301233671791
+   0. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   1. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233673363
+   now           = 1301233672903
+   0. http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+   1. http://simple.wikipedia.org/wiki/%27N_Sync
+ fetching http://simple.wikipedia.org/wiki/%60Abdu%27l-Bah%C3%A1
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 1
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233673363
+   now           = 1301233673908
+   0. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233678937
+   now           = 1301233674914
+   0. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233678937
+   now           = 1301233675919
+   0. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233678937
+   now           = 1301233676925
+   0. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233678937
+   now           = 1301233677930
+   0. http://simple.wikipedia.org/wiki/%27N_Sync
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233678937
+   now           = 1301233679037
+   0. http://simple.wikipedia.org/wiki/%27N_Sync
+ fetching http://simple.wikipedia.org/wiki/%27N_Sync
+ -finishing thread FetcherThread, activeThreads=9
+ -finishing thread FetcherThread, activeThreads=8
+ -finishing thread FetcherThread, activeThreads=7
+ -finishing thread FetcherThread, activeThreads=6
+ -finishing thread FetcherThread, activeThreads=5
+ -finishing thread FetcherThread, activeThreads=4
+ -finishing thread FetcherThread, activeThreads=3
+ -finishing thread FetcherThread, activeThreads=2
+ -finishing thread FetcherThread, activeThreads=1
+ -finishing thread FetcherThread, activeThreads=0
+ -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
+ -activeThreads=0
+ Fetcher: finished at 2011-03-27 15:48:04, elapsed: 00:01:04
+ CrawlDb update: starting at 2011-03-27 15:48:09
+ CrawlDb update: db: crawl/crawldb
+ CrawlDb update: segments: [crawl/segments/20110327154649]
+ CrawlDb update: additions allowed: true
+ CrawlDb update: URL normalizing: false
+ CrawlDb update: URL filtering: false
+ CrawlDb update: Merging segment data into db.
+ CrawlDb update: finished at 2011-03-27 15:48:19, elapsed: 00:00:09
+ LinkDb: starting at 2011-03-27 15:48:24
+ LinkDb: linkdb: crawl/linkdb
+ LinkDb: URL normalize: true
+ LinkDb: URL filter: true
+ LinkDb: adding segment: file:/Users/simpatico/nutch-1.2/crawl/segments/20110327154649
+ LinkDb: finished at 2011-03-27 15:48:32, elapsed: 00:00:07
+ SolrIndexer: starting at 2011-03-27 15:48:36
+ SolrIndexer: finished at 2011-03-27 15:48:54, elapsed: 00:00:17
+ Injector: starting at 2011-03-27 15:48:58
+ Injector: crawlDb: crawl/crawldb
+ Injector: urlDir: seeds/it_seeds
+ Injector: Converting injected urls to crawl db entries.
+ Injector: Merging injected urls into crawl db.
+ Injector: finished at 2011-03-27 15:49:15, elapsed: 00:00:16
+ Fetcher: starting at 2011-03-27 15:49:42
+ Fetcher: segment: crawl/segments/20110327154933
+ Fetcher: threads: 10
+ QueueFeeder finished: total 10 records + hit by time limit :0
+ fetching http://simple.wikipedia.org/wiki/%C3%81ngel_S%C3%A1nchez_%28baseball%29
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=9
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=9
+ fetching http://simple.wikipedia.org/wiki/%C3%81ngel_Javier_Arizmendi
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=8
+ fetching http://simple.wikipedia.org/wiki/%C3%81o_d%C3%A0i
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=7
+ fetching http://simple.wikipedia.org/wiki/%C3%82nderson_Polga
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=6
+ fetching http://simple.wikipedia.org/wiki/%C3%81lvaro_Recoba
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=5
+ fetching http://simple.wikipedia.org/wiki/%C3%81lvaro_Sabor%C3%ADo
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 1
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233819220
+   now           = 1301233819888
+   0. http://simple.wikipedia.org/wiki/%C3%81ttila_de_Carvalho
+   1. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   2. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   3. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233825026
+   now           = 1301233820895
+   0. http://simple.wikipedia.org/wiki/%C3%81ttila_de_Carvalho
+   1. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   2. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   3. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233825026
+   now           = 1301233821902
+   0. http://simple.wikipedia.org/wiki/%C3%81ttila_de_Carvalho
+   1. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   2. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   3. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233825026
+   now           = 1301233823027
+   0. http://simple.wikipedia.org/wiki/%C3%81ttila_de_Carvalho
+   1. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   2. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   3. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233825026
+   now           = 1301233824032
+   0. http://simple.wikipedia.org/wiki/%C3%81ttila_de_Carvalho
+   1. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   2. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   3. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233825026
+   now           = 1301233825039
+   0. http://simple.wikipedia.org/wiki/%C3%81ttila_de_Carvalho
+   1. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   2. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   3. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ fetching http://simple.wikipedia.org/wiki/%C3%81ttila_de_Carvalho
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233830697
+   now           = 1301233826047
+   0. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   1. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   2. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233830697
+   now           = 1301233827053
+   0. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   1. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   2. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233830697
+   now           = 1301233828058
+   0. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   1. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   2. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233830697
+   now           = 1301233829165
+   0. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   1. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   2. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=3
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233830697
+   now           = 1301233830170
+   0. http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+   1. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   2. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ fetching http://simple.wikipedia.org/wiki/%C3%81stor_Piazzolla
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 1
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233830697
+   now           = 1301233831176
+   0. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   1. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233836264
+   now           = 1301233832271
+   0. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   1. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233836264
+   now           = 1301233833402
+   0. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   1. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233836264
+   now           = 1301233834407
+   0. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   1. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233836264
+   now           = 1301233835414
+   0. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   1. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=2
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233836264
+   now           = 1301233836420
+   0. http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+   1. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ fetching http://simple.wikipedia.org/wiki/%C3%82nderson_Lima_Veiga
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233841867
+   now           = 1301233837520
+   0. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233841867
+   now           = 1301233838633
+   0. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233841867
+   now           = 1301233839667
+   0. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233841867
+   now           = 1301233840700
+   0. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=1
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233841867
+   now           = 1301233841923
+   0. http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ fetching http://simple.wikipedia.org/wiki/%C3%81ngel_de_Saavedra,_Duke_of_Rivas
+ -finishing thread FetcherThread, activeThreads=9
+ -finishing thread FetcherThread, activeThreads=8
+ -finishing thread FetcherThread, activeThreads=7
+ -finishing thread FetcherThread, activeThreads=6
+ -finishing thread FetcherThread, activeThreads=5
+ -finishing thread FetcherThread, activeThreads=4
+ -finishing thread FetcherThread, activeThreads=3
+ -finishing thread FetcherThread, activeThreads=2
+ -finishing thread FetcherThread, activeThreads=1
+ -finishing thread FetcherThread, activeThreads=0
+ -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
+ -activeThreads=0
+ Fetcher: finished at 2011-03-27 15:50:47, elapsed: 00:01:04
+ CrawlDb update: starting at 2011-03-27 15:50:52
+ CrawlDb update: db: crawl/crawldb
+ CrawlDb update: segments: [crawl/segments/20110327154933]
+ CrawlDb update: additions allowed: true
+ CrawlDb update: URL normalizing: false
+ CrawlDb update: URL filtering: false
+ CrawlDb update: Merging segment data into db.
+ CrawlDb update: finished at 2011-03-27 15:51:03, elapsed: 00:00:10
+ LinkDb: starting at 2011-03-27 15:51:08
+ LinkDb: linkdb: crawl/linkdb
+ LinkDb: URL normalize: true
+ LinkDb: URL filter: true
+ LinkDb: adding segment: file:/Users/simpatico/nutch-1.2/crawl/segments/20110327154649
+ LinkDb: adding segment: file:/Users/simpatico/nutch-1.2/crawl/segments/20110327154933
+ LinkDb: merging with existing linkdb: crawl/linkdb
+ LinkDb: finished at 2011-03-27 15:51:27, elapsed: 00:00:18
+ SolrIndexer: starting at 2011-03-27 15:51:31
+ SolrIndexer: finished at 2011-03-27 15:51:54, elapsed: 00:00:22
+ 
+ 
+ 
+ }}}
+ 
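The queue dumps in the log above appear to show the fetcher's per-host politeness delay at work: with `maxThreads = 1` and `crawlDelay = 5000`, each fetch from simple.wikipedia.org waits until `nextFetchTime`. The following is a minimal Python sketch, not part of the crawl script, that reads such dumps and reports how long each queue still has to wait. The names `parse_queue_dumps`, `wait_ms`, and `SAMPLE` are hypothetical helpers introduced here for illustration; the sample lines are copied from the log above.

```python
import re

# Matches the "key = value" lines of a queue dump, e.g. "+   crawlDelay    = 5000".
FIELD_RE = re.compile(r"^\+?\s*(\w+)\s*=\s*(\d+)\s*$")

# Two queue dumps copied from the log above (diff "+ " markers included).
SAMPLE = """\
+ -activeThreads=10, spinWaiting=9, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 1
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233656322
+   now           = 1301233656859
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
+ -activeThreads=10, spinWaiting=10, fetchQueues.totalSize=4
+ * queue: http://simple.wikipedia.org
+   maxThreads    = 1
+   inProgress    = 0
+   crawlDelay    = 5000
+   minCrawlDelay = 0
+   nextFetchTime = 1301233662094
+   now           = 1301233657867
+   0. http://simple.wikipedia.org/wiki/%C3%81lvaro_Arbeloa
"""


def parse_queue_dumps(log_text):
    """Return one dict per '* queue:' block, holding its integer fields."""
    dumps = []
    current = None
    for line in log_text.splitlines():
        # Drop the leading diff marker and indentation before checking the header.
        stripped = line.lstrip("+ ").rstrip()
        if stripped.startswith("* queue:"):
            current = {"queue": stripped[len("* queue:"):].strip()}
            dumps.append(current)
            continue
        match = FIELD_RE.match(line)
        if match and current is not None:
            current[match.group(1)] = int(match.group(2))
    return dumps


def wait_ms(dump):
    """Milliseconds left until the queue may fetch again (0 if already due)."""
    return max(0, dump["nextFetchTime"] - dump["now"])


if __name__ == "__main__":
    for dump in parse_queue_dumps(SAMPLE):
        print(dump["queue"], "crawlDelay:", dump["crawlDelay"], "wait:", wait_ms(dump))
```

For the two sample dumps, the first queue is already due (a fetch is in progress), and the second still has 4227 ms to wait, which is consistent with the roughly five-second gaps between `fetching` lines in the log.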
  2. Unabridged script with explanations and using nutch index:
  
  {{{
