lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Indexing HTML files in SOLR
Date Sun, 20 Jun 2010 00:40:55 GMT
Ah! You need a SolrJ program that uses Tika to parse the files and
upload the text. I think there is such a program already but do not
know where it is.

Lance

On Thu, Jun 17, 2010 at 6:13 AM, seesiddharth <seesiddharth@yahoo.com> wrote:
>
> Thank you so much for the reply. The link you suggested is helpful, but
> it explains everything using the curl command, which I don't want to
> use.
> I was more interested in uploading the .html documents using an HTTP web
> request.
> So I stored all the .html files in one location and then created an HTML
> parser that fetches the content from these HTML files and builds an XML
> string (like <add><doc><field
> name=""></field>.....</doc>......<doc>...</doc></add>).
> Then I sent this XML string to the Solr server using the HTTP web request
> method (in .NET) to add/update the documents.
> Now I am able to search the data in Solr for all uploaded documents.
> It would be great if you could answer my question:
> Is there any better approach to achieve the same functionality?
>
> Regards,
> Siddharth
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Indexing-HTML-files-in-SOLR-tp896530p902644.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
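
For reference, the approach described in the quoted message (extract the text
from each HTML file, build an <add><doc>... XML string, and send it to Solr
with a plain HTTP request) could look roughly like the sketch below. It is
written in Python with only the standard library rather than .NET, and it is
not the poster's actual code. The field names (id, title, text), the helper
names, and the update URL are assumptions for illustration; they have to match
your own schema.xml and Solr installation.

```python
from html.parser import HTMLParser
import urllib.request
import xml.etree.ElementTree as ET


class TextExtractor(HTMLParser):
    """Collects the <title> text and the visible text of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore script and stylesheet contents
        if self._in_title:
            self.title_parts.append(data)
        elif data.strip():
            self.text_parts.append(data.strip())


def html_to_fields(doc_id, html):
    """Turn one HTML page into a dict of field name -> value.

    The field names are hypothetical and must exist in your schema.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return {
        "id": doc_id,
        "title": "".join(parser.title_parts).strip(),
        "text": " ".join(parser.text_parts),
    }


def build_add_xml(docs):
    """Build the <add><doc><field name="...">...</field></doc></add> string.

    ElementTree escapes characters such as & and < in the field values.
    """
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


def post_to_solr(xml_body, url="http://localhost:8983/solr/update?commit=true"):
    """POST the XML to Solr. The URL is an assumption; adjust host and path."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Typical use would be to call html_to_fields() once per file, pass the list of
results to build_add_xml(), and send the returned string with post_to_solr().
Building the XML with a library instead of string concatenation avoids
malformed requests when the page text contains characters like & or <.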



-- 
Lance Norskog
goksron@gmail.com
