lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Apache Nutch 1.5.1 + Apache Solr 4.0
Date Thu, 08 Nov 2012 13:15:46 GMT
Hi, 

Your Nutch schema likely points to the old EnglishPorterFilter, which doesn't exist anymore.
You can change that occurrence to PorterStemFilterFactory; that should fix the issue.
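
As a rough sketch of that change, assuming your Nutch schema.xml declares the stemmer through solr.EnglishPorterFilterFactory with a line like the first one below (the attributes in your copy may differ, so check the actual file), the edit would look something like this:

```xml
<!-- Before: the old English Porter filter, which Solr 4.0 no longer provides.
     The protected attribute is an assumption about what your file contains. -->
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>

<!-- After: the same position in the analyzer chain, using the Porter stem filter -->
<filter class="solr.PorterStemFilterFactory"/>
```

Restart Solr after editing the file so the schema is read again.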
 
-----Original message-----
> From:Antony Steiner <ant.steiner@gmail.com>
> Sent: Thu 08-Nov-2012 14:05
> To: solr-user@lucene.apache.org
> Subject: Apache Nutch 1.5.1 + Apache Solr 4.0
> 
> Hello, my name is Antony and I'm new to Apache Nutch and Solr.
> 
> I want to crawl my website, so I downloaded Nutch to do this.
> This works fine. But now I would like to integrate Nutch with Solr. I'm
> running this on my Unix system.
> I'm trying to follow this tutorial:
> http://wiki.apache.org/nutch/NutchTutorial
> But it won't work for me. Running Solr without Nutch is no problem. I can post
> documents to Solr with post.jar. But what I want to do is post my Nutch
> crawl to Solr.
> Now if I copy the schema.xml from Nutch to the
> apache-solr-4.0.0/example/solr/collection1/conf directory and restart Solr
> (java -jar start.jar), I get the errors below, but Solr will start. (Is this
> the correct directory to copy my schema?)
> 
> Nov 8, 2012 9:40:33 AM org.apache.solr.schema.IndexSchema readSchema
> INFO: Schema name=nutch
> Nov 8, 2012 9:40:33 AM org.apache.solr.core.CoreContainer create
> SEVERE: Unable to create core: collection1
> org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
>         at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
>         at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
> ...
> 
> Nov 8, 2012 9:40:33 AM org.apache.solr.common.SolrException log
> SEVERE: null:org.apache.solr.common.SolrException: Schema Parsing Failed: multiple points
>         at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:571)
>         at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
>         at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
> ...
> 
> Now if I don't copy the schema and push my Nutch crawl to Solr, I get the
> following error:
> 
> SolrIndexer: starting at 2012-11-08 10:49:02
> Indexing 5 documents
> java.io.IOException: Job failed!
> SolrDeleteDuplicates: starting at 2012-11-08 10:49:47
> SolrDeleteDuplicates: Solr url: http://photon:8983/solr/
> 
> And this is taken from the logging:
> org.apache.solr.common.SolrException: ERROR: [doc=http://e-docs/infrastructure/cpuload_monitor.html] unknown field 'host'
> 
> What should I do or what am I missing?
> 
> I hope you can help me
> Best Regards
> Antony
> 
