lucene-solr-user mailing list archives

From ahmad ajiloo <ahmad.aji...@gmail.com>
Subject Re: How to search on specific file types ?
Date Tue, 13 Sep 2011 18:57:03 GMT
1- How can I put the file extension into my index? I'm using Nutch to
crawl web pages and sending Nutch's data to Solr for indexing, and I have
no idea how to put the file extension into my index.
2- Please give me some helpful links about MIME types. I'm new to Solr and
don't know anything about MIME types. Please note that I have to index
Nutch's data, and I couldn't find useful commands in the Nutch tutorial for
advanced indexing!
Thank you very much


On Mon, Sep 12, 2011 at 6:07 PM, Jaeger, Jay - DOT <Jay.Jaeger@dot.wi.gov> wrote:

> Some possibilities:
>
> 1) Put the file extension into your index (that is what we did when we were
> testing indexing documents with Solr)
> 2) Put a mime type for the document into your index.
> 3) Put the whole file name / URL into your index, and match on part of the
> name.  This will give some false positives.
>
> JRJ
>
> -----Original Message-----
> From: ahmad ajiloo [mailto:ahmad.ajiloo@gmail.com]
> Sent: Monday, September 12, 2011 5:58 AM
> To: solr-user@lucene.apache.org
> Subject: Fwd: How to search on specific file types ?
>
> Hello
> I want to search for articles, so I need to find only specific file types
> such as doc, docx, and pdf.
> I don't need any HTML pages. Thus the results of our search should consist
> only of doc, docx, and pdf files.
> Can you help me?
>
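
The first possibility quoted above (put the file extension into the index, then restrict searches to it) could look roughly like the Python sketch below. It is an illustration under assumptions, not something taken from this thread: the field name `ext` and the helper names are hypothetical, the field would have to be defined in your Solr schema, and how the value gets from Nutch into that field is not covered here. The only Solr features relied on are the standard `q` and `fq` request parameters and the `field:(a OR b)` query syntax.

```python
from posixpath import splitext
from urllib.parse import urlencode, urlparse


def file_extension(url):
    """Return the lowercased extension of the URL path, without the dot.

    The query string and fragment are ignored, so
    'http://host/a/Thesis.PDF?download=1' yields 'pdf'.
    """
    path = urlparse(url).path
    return splitext(path)[1].lstrip(".").lower()


def filter_query(field, values):
    """Build a Solr filter query that matches any of the given values."""
    return "{}:({})".format(field, " OR ".join(values))


def select_params(user_query, extensions, field="ext"):
    """Encode q and fq as a query string for Solr's /select handler.

    'ext' is a hypothetical field name; use whatever your schema defines.
    """
    return urlencode({"q": user_query, "fq": filter_query(field, extensions)})
```

At index time, `file_extension(url)` would supply the value stored in the hypothetical `ext` field. At query time, `select_params("solr tutorial", ["doc", "docx", "pdf"])` produces the parameters to append to a `/select` request, so HTML pages are filtered out without affecting relevance scoring. Storing the extracted extension, rather than matching on part of the URL as in the third possibility, avoids the false positives mentioned there.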
