manifoldcf-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: Rules of excluding specific files in Windows file server are not recognized
Date Tue, 11 Sep 2012 11:44:46 GMT
Hi Shigeki

Can you try entering "*text.txt" in the text box?

Ahmet
--- On Tue, 9/11/12, Shigeki Kobayashi <shigeki.kobayashi3@g.softbank.co.jp> wrote:

From: Shigeki Kobayashi <shigeki.kobayashi3@g.softbank.co.jp>
Subject: Rules of excluding specific files in Windows file server are not recognized
To: user@manifoldcf.apache.org
Date: Tuesday, September 11, 2012, 1:46 PM

Hi guys.
I need some help excluding specific files from crawling.
I am trying to crawl a Windows file server using the Windows shares connector and index it into Solr.

There are some files I do not want to index, so I set rules to exclude them from crawling,
but the job still crawls them.
For example, I do NOT want to index "text.txt" in directory D, which is the root path.


In the "Paths" tab:
- Set D as the root path.
- To create the crawling rules, chose "exclude" and "file" from the pulldown and entered
  "text.txt" in the text box.
- The resulting list of crawling rules is as follows:
  1. Exclude file(s) matching text.txt
  2. Include indexable file(s) matching *
  3. Include directory(s) matching *
- Saved the job settings.

As a result, the job still tries to crawl the file. I wonder why "text.txt" does not match
the crawling rule.
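
To illustrate one possible explanation, here is a minimal Python sketch of ordered rule matching. It is not the connector's actual code. It assumes (unconfirmed) that rules are tried top to bottom, that the first match decides, and that the pattern is compared against the whole path rather than just the file name. The names RULES and decide and the path "D/text.txt" are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical rule list mirroring the "Paths" tab: (action, pattern).
# The file/directory distinction is left out to keep the sketch short.
RULES = [
    ("exclude", "text.txt"),
    ("include", "*"),
]

def decide(path, rules):
    """Return the action of the first rule whose pattern matches the whole path."""
    for action, pattern in rules:
        # Assumption: the pattern must match the entire path, not only the file name.
        if fnmatchcase(path, pattern):
            return action
    # Assumption: a path that matches no rule is not crawled.
    return "exclude"

# "text.txt" does not match the full path "D/text.txt", so rule 2 wins.
print(decide("D/text.txt", RULES))  # include

# With a leading wildcard the exclude rule matches first.
print(decide("D/text.txt", [("exclude", "*text.txt"), ("include", "*")]))  # exclude
```

If the matching really works this way, a bare "text.txt" would only exclude a path that is exactly "text.txt", which would explain the behavior described above.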


Does anyone know what I did wrong?
Versions: MCF 0.5, Solr 3.5, MySQL 5.5

Regards,
Shigeki


