nutch-dev mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Nutch Wiki] Update of "IndexWriters" by RoannelFernandez
Date Mon, 18 Jun 2018 20:00:17 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "IndexWriters" page has been changed by RoannelFernandez:
https://wiki.apache.org/nutch/IndexWriters?action=diff&rev1=9&rev2=10

Comment:
CloudSearch indexer properties

  || user || Username for auth credentials (only used when https is enabled) || user ||
  || password || Password for auth credentials (only used when https is enabled) || password ||
  || type || Default type to send documents to. || doc ||
- || https || '''true''' to enable https, '''false''' to disable https If you've disabled
http access (by forcing https), be sure to set this to true, otherwise you might get "connection
reset by peer". || false ||
+ || https || '''true''' to enable https, '''false''' to disable https. If you've disabled
http access (by forcing https), be sure to set this to true, otherwise you might get "connection
reset by peer". || false ||
  || trustallhostnames || '''true''' to trust elasticsearch server's certificate even if its
listed domain name does not match the domain they are hosted on '''false''' to check if the
elasticsearch server's certificate's listed domain is the same domain that it is hosted on,
and if it doesn't, then fail to index (only used when https is enabled) || false ||
  || languages || A list of strings denoting the supported languages (e.g. `en, de, fr, it`).
If this value is empty all documents will be sent to `index` property. If not empty the Rest
client will distribute documents in different indices based on their `languages` property.
Indices are named with the following schema: `index` `separator` `language` (e.g. `nutch_de`).
Entries with an unsupported `languages` value will be added to index `index` `separator` `sink`
(e.g. `nutch_others`). ||  ||
  || separator || Is used only if `languages` property is defined to build the index name
(i.e. `index` `separator` `lang`).  || _ ||
  || sink || Is used only if `languages` property is defined to build the index name where
to store documents with unsupported languages (i.e. `index` `separator` `sink`). || others ||
  
+ === CloudSearch indexer properties ===
+ 
+ ||'''Parameter Name''' ||'''Description''' ||'''Default value''' ||
+ || endpoint || Endpoint where service requests should be submitted. ||  ||
+ || region || Region name. ||  ||
+ || batch.dump || '''true''' to send documents to a local file. || false ||
+ || batch.maxSize || Maximum number of documents to send as a batch to CloudSearch. || -1 ||
+ 
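
The index naming rule described in the `languages`, `separator` and `sink` rows above can be sketched as follows. This is only an illustration of the rule as the table words it, not the indexer's actual code; the function name `resolve_index` and its signature are hypothetical.

```python
from typing import Sequence


def resolve_index(index: str, language: str, languages: Sequence[str],
                  separator: str = "_", sink: str = "others") -> str:
    """Return the index name a document would be routed to.

    Hypothetical helper: it mirrors the table's description only.
    The defaults for `separator` and `sink` are the ones listed in the table.
    """
    # Empty `languages`: every document goes to the plain `index`.
    if not languages:
        return index
    # Supported language: `index` `separator` `language`.
    if language in languages:
        return f"{index}{separator}{language}"
    # Unsupported language: `index` `separator` `sink`.
    return f"{index}{separator}{sink}"


if __name__ == "__main__":
    supported = ["en", "de", "fr", "it"]
    print(resolve_index("nutch", "de", supported))
    print(resolve_index("nutch", "es", supported))
    print(resolve_index("nutch", "es", []))
```

With `index` set to `nutch`, a German document maps to `nutch_de` and a document in an unlisted language maps to `nutch_others`, matching the examples given in the table.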
