From: "Goldschmidt, Dave"
Reply-To: nutch-dev@lucene.apache.org
Subject: API for injecting content into Nutch?
Date: Mon, 26 Sep 2005 14:32:12 -0400

Hello,

Is there an API of some sort for injecting content into Nutch *without* using Nutch's crawler? Or does anyone have ideas as to how to approach this problem? I.e., given a URL, a page of content, metadata about the page, links, etc., how can I inject this into Nutch without Nutch performing the crawl?

Thanks in advance for your ideas and insights,

DaveG