lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-4531) corrupted tlog causes recovery failed
Date Tue, 25 Oct 2016 06:43:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15604405#comment-15604405 ]

ASF subversion and git services commented on SOLR-4531:
-------------------------------------------------------

Commit 529654c02be519accf0e83a5919570fa2d9bc69d in lucene-solr's branch refs/heads/branch_6x
from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=529654c ]

SOLR-4531: Add tests to ensure that recovery does not fail on corrupted tlogs

(cherry picked from commit b7aa582)


> corrupted tlog causes recovery failed
> -------------------------------------
>
>                 Key: SOLR-4531
>                 URL: https://issues.apache.org/jira/browse/SOLR-4531
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.0
>            Reporter: Simon Scofield
>         Attachments: SOLR-4531.patch, SOLR-4531.patch
>
>
> One of the Solr nodes in our SolrCloud cluster was killed, which corrupted its tlog. Now
> the node cannot finish recovering. It fails with this exception:
> Caused by: java.lang.IndexOutOfBoundsException: Index: 14, Size: 13
> 	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> 	at java.util.ArrayList.get(ArrayList.java:322)
> 	at org.apache.solr.update.TransactionLog$LogCodec.readExternString(TransactionLog.java:128)
> 	at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:188)
> 	at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:120)
> 	at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:184)
> 	at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:451)
> 	at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:182)
> 	at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:121)
> 	at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:184)
> 	at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:451)
> 	at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:182)
> 	at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:451)
> 	at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:182)
> 	at org.apache.solr.update.TransactionLog$ReverseReader.next(TransactionLog.java:708)
> 	at org.apache.solr.update.UpdateLog$RecentUpdates.update(UpdateLog.java:906)
> 	at org.apache.solr.update.UpdateLog$RecentUpdates.access$000(UpdateLog.java:846)
> 	at org.apache.solr.update.UpdateLog.getRecentUpdates(UpdateLog.java:996)
> 	at org.apache.solr.update.UpdateLog.init(UpdateLog.java:241)
> 	at org.apache.solr.update.UpdateHandler.initLog(UpdateHandler.java:94)
> 	at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:123)
> 	at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:97)
> 	... 31 more
> I checked the code in UpdateLog.java and found that only IOException is caught at the point
> where the above exception is thrown (the reader.next() call sits outside the inner try block).
> {code:title=solr\\core\\src\\java\\org\\apache\\solr\\update\\UpdateLog.java|borderStyle=solid}
>     private void update() {
>       int numUpdates = 0;
>       updateList = new ArrayList<List<Update>>(logList.size());
>       deleteByQueryList = new ArrayList<Update>();
>       deleteList = new ArrayList<DeleteUpdate>();
>       updates = new HashMap<Long,Update>(numRecordsToKeep);
>       for (TransactionLog oldLog : logList) {
>         List<Update> updatesForLog = new ArrayList<Update>();
>         TransactionLog.ReverseReader reader = null;
>         try {
>           reader = oldLog.getReverseReader();
>           while (numUpdates < numRecordsToKeep) {
>             Object o = reader.next();
>             if (o==null) break;
>             try {
>               // should currently be a List<Oper,Ver,Doc/Id>
>               List entry = (List)o;
>               // TODO: refactor this out so we get common error handling
>               int opAndFlags = (Integer)entry.get(0);
>               if (latestOperation == 0) {
>                 latestOperation = opAndFlags;
>               }
>               int oper = opAndFlags & UpdateLog.OPERATION_MASK;
>               long version = (Long) entry.get(1);
>               switch (oper) {
>                 case UpdateLog.ADD:
>                 case UpdateLog.DELETE:
>                 case UpdateLog.DELETE_BY_QUERY:
>                   Update update = new Update();
>                   update.log = oldLog;
>                   update.pointer = reader.position();
>                   update.version = version;
>                   updatesForLog.add(update);
>                   updates.put(version, update);
>                   
>                   if (oper == UpdateLog.DELETE_BY_QUERY) {
>                     deleteByQueryList.add(update);
>                   } else if (oper == UpdateLog.DELETE) {
>                     deleteList.add(new DeleteUpdate(version, (byte[])entry.get(2)));
>                   }
>                   
>                   break;
>                 case UpdateLog.COMMIT:
>                   break;
>                 default:
>                   throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,  "Unknown Operation! " + oper);
>               }
>             } catch (ClassCastException cl) {
>               log.warn("Unexpected log entry or corrupt log.  Entry=" + o, cl);
>               // would be caused by a corrupt transaction log
>             } catch (Exception ex) {
>               log.warn("Exception reverse reading log", ex);
>               break;
>             }
>           }
>         } catch (IOException e) {
>           // failure to read a log record isn't fatal
>           log.error("Exception reading versions from log",e);
>         } finally {
>           if (reader != null) reader.close();
>         }
>         updateList.add(updatesForLog);
>       }
>     }
> {code}
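
To illustrate the gap described in the report, here is a minimal, self-contained sketch. It is not Solr code: CorruptLogSketch, RecordReader and corruptReader are hypothetical names, and the reader only simulates a record whose string-table index is out of range. It shows that an unchecked IndexOutOfBoundsException thrown from next() passes straight through a catch (IOException) block, and that also catching RuntimeException around the read loop is one possible way to treat it as a corrupt log. This is not necessarily what the committed change does.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a reverse log reader; not Solr's TransactionLog API.
class CorruptLogSketch {

  /** Minimal reader interface: returns null at end of log. */
  interface RecordReader {
    Object next() throws IOException;
  }

  /** Simulates a record whose string-table index points past the end of the table. */
  static RecordReader corruptReader() {
    List<String> externStrings = new ArrayList<>(Arrays.asList("id", "name"));
    return () -> externStrings.get(5); // throws IndexOutOfBoundsException
  }

  /** Mirrors the reported shape: only IOException is caught around next(). */
  static int readCatchingIOExceptionOnly(RecordReader reader) {
    int count = 0;
    try {
      while (reader.next() != null) {
        count++;
      }
    } catch (IOException e) {
      // Not reached for an unchecked exception, so the caller sees the failure.
      System.err.println("I/O error reading log: " + e);
    }
    return count;
  }

  /** One possible hardening: treat unchecked decode failures as a corrupt log too. */
  static int readCatchingAll(RecordReader reader) {
    int count = 0;
    try {
      while (reader.next() != null) {
        count++;
      }
    } catch (IOException | RuntimeException e) {
      System.err.println("Skipping rest of unreadable log: " + e);
    }
    return count;
  }

  public static void main(String[] args) {
    // The hardened loop returns normally with the records read so far.
    System.out.println("records read: " + readCatchingAll(corruptReader()));
    try {
      readCatchingIOExceptionOnly(corruptReader());
    } catch (IndexOutOfBoundsException e) {
      // The unchecked exception escaped the IOException-only handler.
      System.out.println("escaped: " + e.getClass().getSimpleName());
    }
  }
}
```

In the quoted update() method the same reasoning applies to the outer try block: the inner catch (Exception ex) only covers the entry parsing, not the reader.next() call itself.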



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)