nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Trivial Update of "CommandLineOptions" by LewisJohnMcgibbney
Date Fri, 11 Jan 2013 01:55:16 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "CommandLineOptions" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/CommandLineOptions?action=diff&rev1=44&rev2=45

  
  See each entry for details of the command arguments and options.
  
- ||'''command'''||'''function'''||version||
+ ||'''command'''||'''function'''||'''version'''||
  || || || '''1.x''' || '''2.x''' ||
- ||[[bin/nutch_crawl]]||One-step crawler for intranets||
+ ||[[bin/nutch_crawl]]||One-step crawler for intranets|| X ||X||
- ||[[bin/nutch_readdb]]||Read / dump crawl db||
+ ||[[bin/nutch_readdb]]||Read / dump crawl db|| X ||X||
- ||[[bin/nutch mergedb]]||Merge crawldb-s, with optional filtering||
+ ||[[bin/nutch mergedb]]||Merge crawldb-s, with optional filtering|| X ||||
- ||[[bin/nutch readlinkdb]]||Read / dump link db||
+ ||[[bin/nutch readlinkdb]]||Read / dump link db|| X ||||
- ||[[bin/nutch_inject]]||Inject new urls into the database||
+ ||[[bin/nutch_inject]]||Inject new urls into the database|| X ||X||
+ ||[[bin/nutch_hostinject]]||Inject new urls into the hostdatabase||  ||X||
- ||[[bin/nutch_generate]]||Generate new segments to fetch from crawldb||
+ ||[[bin/nutch_generate]]||Generate new segments to fetch from crawldb|| X ||X||
- ||[[bin/nutch_freegen]]||Generate new segments to fetch from text files||
+ ||[[bin/nutch_freegen]]||Generate new segments to fetch from text files|| X ||||
- ||[[bin/nutch_fetch]]||Fetch a segment's pages||
+ ||[[bin/nutch_fetch]]||Fetch a segment's pages|| X ||X||
- ||[[bin/nutch_parse]]||Parse a segment's pages||
+ ||[[bin/nutch_parse]]||Parse a segment's pages|| X ||X||
- ||[[bin/nutch_readseg]]||Read / dump segment data||
+ ||[[bin/nutch_readseg]]||Read / dump segment data|| X ||||
- ||[[bin/nutch_mergesegs]]||Merges multiple segments, with optional filtering and slicing||
+ ||[[bin/nutch_mergesegs]]||Merges multiple segments, with optional filtering and slicing|| X ||||
- ||[[bin/nutch_updatedb]]||Update crawldb from segments after fetching||
+ ||[[bin/nutch_updatedb]]||Update crawldb (from segments if in 1.x) after fetching|| X ||X||
+ ||[[bin/nutch_updatehostdb]]||Update hostdb after fetching|| ||X||
- ||[[bin/nutch_invertlinks]]||Create a linkdb from parsed segments||
+ ||[[bin/nutch_invertlinks]]||Create a linkdb from parsed segments|| X ||||
- ||[[bin/nutch_mergelinkdb]]||Merge's linkdb-s, with optional filtering||
+ ||[[bin/nutch_mergelinkdb]]||Merge's linkdb-s, with optional filtering|| X ||||
+ ||[[bin/nutch elasticindex]]||Run the elastic search indexer on parsed batches|| ||X||
- ||[[bin/nutch solrindex]]||Run the solr indexer on parsed segments and linkdb||
+ ||[[bin/nutch solrindex]]||Run the solr indexer on parsed segments and linkdb|| X ||X||
- ||[[bin/nutch solrdedup]]||Removes duplicate documents from solr||
+ ||[[bin/nutch solrdedup]]||Removes duplicate documents from solr|| X ||X||
- ||[[bin/nutch solrclean]]||Removes HTTP 301 and 404 documents from solr||
+ ||[[bin/nutch solrclean]]||Removes HTTP 301 and 404 documents from solr|| X ||||
- ||[[bin/nutch parsechecker]]||Checks the parser for a given url||
+ ||[[bin/nutch parsechecker]]||Checks the parser for a given url|| X ||X||
- ||[[bin/nutch indexchecker]]||Checks the indexing filters for a given url||
+ ||[[bin/nutch indexchecker]]||Checks the indexing filters for a given url|| X ||||
- ||[[bin/nutch domainstats]]||Calculates domain statistics from crawldb||
+ ||[[bin/nutch domainstats]]||Calculates domain statistics from crawldb|| X ||||
- ||[[bin/nutch webgraph]]||Generates a web graph from existing segments||
+ ||[[bin/nutch webgraph]]||Generates a web graph from existing segments|| X ||||
- ||[[bin/nutch linkrank]]||Runs a link analysis program on the generated web graph||
+ ||[[bin/nutch linkrank]]||Runs a link analysis program on the generated web graph|| X ||||
- ||[[bin/nutch scoreupdater]]||Updates the crawldb with linkrank scores||
+ ||[[bin/nutch scoreupdater]]||Updates the crawldb with linkrank scores|| X ||||
- ||[[bin/nutch nodedumper]]||Dumps the web graph's node scores||
+ ||[[bin/nutch nodedumper]]||Dumps the web graph's node scores|| X ||||
- ||[[bin/nutch plugin]]||Loads a plugin and run one of its classes main()||
+ ||[[bin/nutch plugin]]||Loads a plugin and run one of its classes main()|| X ||X||
+ ||[[bin/nutch nutchserver]]||run a (local) Nutch server on a user defined port|| ||X||
- ||[[bin/nutch junit]]||Runs the given JUnit test||
+ ||[[bin/nutch junit]]||Runs the given JUnit test|| X ||X||
  or
- ||[[bin/nutch CLASSNAME]]||run the class named CLASSNAME||
+ ||[[bin/nutch CLASSNAME]]||run the class named CLASSNAME|| X ||X||
  
  == Webgraph classes ==
  
