lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: trunk is unable to replicate between nodes ( Unable to download ... completely)
Date Mon, 05 Nov 2012 10:31:08 GMT
https://issues.apache.org/jira/browse/SOLR-4032

 
 
-----Original message-----
> From:Mark Miller <markrmiller@gmail.com>
> Sent: Sat 03-Nov-2012 14:25
> To: solr-user@lucene.apache.org
> Subject: Re: trunk is unable to replicate between nodes ( Unable to download ... completely)
> 
> Likely some of the trunk work around allowing any Directory impl to replicate. JIRA pls :)
> 
> - Mark
> 
> On Oct 30, 2012, at 12:29 PM, Markus Jelsma <markus.jelsma@openindex.io> wrote:
> 
> > Hi,
> > 
> > We're testing again with today's trunk and using the new Lucene 4.1 format by default. When nodes are not restarted things are kind of stable, but restarting nodes leads to a lot of mayhem. It seems we can get the cluster back up and running by clearing ZK and restarting everything (another issue), but replication becomes impossible for some nodes, leaving them in a continuous state of failing recovery.
> > 
> > Here are some excerpts from the logs:
> > 
> > 2012-10-30 16:12:39,674 ERROR [solr.servlet.SolrDispatchFilter] - [http-8080-exec-5] - : null:java.lang.IndexOutOfBoundsException
> >        at java.nio.Buffer.checkBounds(Buffer.java:530)
> >        at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:218)
> >        at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:91)
> >        at org.apache.solr.handler.ReplicationHandler$DirectoryFileStream.write(ReplicationHandler.java:1065)
> >        at org.apache.solr.handler.ReplicationHandler$3.write(ReplicationHandler.java:932)
> > 
> > 
> > 2012-10-30 16:10:32,220 ERROR [solr.handler.ReplicationHandler] - [RecoveryThread] - : SnapPull failed :org.apache.solr.common.SolrException: Unable to download _x.fdt completely. Downloaded 13631488!=13843504
> >        at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.cleanup(SnapPuller.java:1237)
> >        at org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1118)
> >        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:716)
> >        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:387)
> >        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:273)
> >        at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:152)
> >        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:407)
> > 
> > 2012-10-30 16:12:51,061 WARN [solr.handler.ReplicationHandler] - [http-8080-exec-3] - : Exception while writing response for params: file=_p_Lucene41_0.doc&command=filecontent&checksum=true&generation=6&qt=/replication&wt=filestream
> > java.io.EOFException: read past EOF: MMapIndexInput(path="/opt/solr/cores/openindex_h/data/index.20121030152234973/_p_Lucene41_0.doc")
> >        at org.apache.lucene.store.ByteBufferIndexInput.readBytes(ByteBufferIndexInput.java:100)
> >        at org.apache.solr.handler.ReplicationHandler$DirectoryFileStream.write(ReplicationHandler.java:1065)
> >        at org.apache.solr.handler.ReplicationHandler$3.write(ReplicationHandler.java:932)
> > 
> > 
> > Needless to say I'm puzzled, so I'm wondering if anyone has seen this before or has some hints that might help dig further.
> > 
> > Thanks,
> > Markus
> 
> 
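
One detail in the quoted SnapPull error may help when digging further: the two byte counts in "Downloaded 13631488!=13843504" can be compared directly. The sketch below only does that arithmetic. The class and method names are hypothetical, and reading the result as "the transfer stopped on a 1 MiB chunk boundary" is an assumption about how the file is streamed, not something the logs confirm.

```java
// Hypothetical helper: only the two byte counts come from the quoted log line.
public class PartialDownloadCheck {
    static final long MIB = 1024L * 1024L;

    // Bytes that never arrived on the pulling node.
    static long missingBytes(long downloaded, long expected) {
        return expected - downloaded;
    }

    // True when the byte count is an exact multiple of 1 MiB.
    static boolean isWholeMiB(long byteCount) {
        return byteCount % MIB == 0;
    }

    public static void main(String[] args) {
        long downloaded = 13631488L; // from "Downloaded 13631488!=13843504"
        long expected = 13843504L;

        System.out.println("missing bytes: " + missingBytes(downloaded, expected));
        System.out.println("downloaded is a whole number of MiB: " + isWholeMiB(downloaded));
        System.out.println("full MiB chunks received: " + (downloaded / MIB));
    }
}
```

The downloaded size is exactly 13 MiB and 212016 bytes are missing, which would be consistent with the sender failing while writing the final, partial chunk of _x.fdt.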
