jackrabbit-oak-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ian Boston (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-3547) Improve ability of the OakDirectory to recover from unexpected file errors
Date Wed, 04 Nov 2015 10:56:27 GMT

    [ https://issues.apache.org/jira/browse/OAK-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989329#comment-14989329
] 

Ian Boston commented on OAK-3547:
---------------------------------

Attempts to break the index operation have produced recovery behaviour that appears stable.

{code}
04.11.2015 10:02:20.391 *INFO* [aysnc-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631340351 as _3.si,229,1a1c2c541a3cb5087d2a3c60ae77b6b05461410d;segments_3,117,a94bece6be54b2690a3b11925ed9bbc80e914d5d;segments_4,117,a74de5ae54af2abbb2423573cd0ad93b950ec18a;_2.cfe,224,dd2d758773e57e172933ff3d3fc3a4908af59dc4;_0_1.del,36,906f2506ff277e28716cc19eb8b55f289e34c53c;_0.si,229,57e0616d14993ad2a9680f55d4151d440cad8255;_2.si,229,ceddd18aa9343666e78f6330d0e261e96474717b;segments.gen,20,395c2b9ba7f05f4debb52b0a7cea8ac56ad671a2;_3.cfe,224,2596860e7bcdd550e221488708afda8729689107;_0.cfs,1431868,a7368bb6e2398a5952eddbc062a498f100a29865;_0.cfe,266,62127a60fe2224e32e3720eb15b2bd9f34d4670a;_3.cfs,1210,1c7764b0713716c53d1b0ed21c0063c5606aad49;_2.cfs,1188,1cd8b738619350e6998b77c2142d5a3748e861f1;

04.11.2015 10:02:20.391 *INFO* [aysnc-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving due to close.
04.11.2015 10:02:20.427 *INFO* [aysnc-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631340391 as _3.si,229,1a1c2c541a3cb5087d2a3c60ae77b6b05461410d;segments_4,117,a74de5ae54af2abbb2423573cd0ad93b950ec18a;_0_1.del,36,906f2506ff277e28716cc19eb8b55f289e34c53c;_0.si,229,57e0616d14993ad2a9680f55d4151d440cad8255;segments.gen,20,395c2b9ba7f05f4debb52b0a7cea8ac56ad671a2;_3.cfe,224,2596860e7bcdd550e221488708afda8729689107;_0.cfs,1431868,a7368bb6e2398a5952eddbc062a498f100a29865;_0.cfe,266,62127a60fe2224e32e3720eb15b2bd9f34d4670a;_3.cfs,1210,1c7764b0713716c53d1b0ed21c0063c5606aad49;

04.11.2015 10:02:23.375 *WARN* [aysnc-index-update-fulltext-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
IO Exception reading index file 
	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.getIndexFileMetaData(OakDirectory.java:965)
	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.validateListing(OakDirectory.java:844)
	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.load(OakDirectory.java:878)
	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.<init>(OakDirectory.java:750)
	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing.<init>(OakDirectory.java:728)
04.11.2015 10:02:23.377 *WARN* [aysnc-index-update-fulltext-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Index File and Oak Version and not the same Name: _xi.fdt, length:14816174, checksum:ff821a1bde2330f1389c782a777747677206c685
 CheckSum: ff821a1bde2330f1389c782a777747677206c685 != Unable to generate checksum java.lang.RuntimeException:
failed to read block from backend, id b535214bddc090c74a426acaeeb5654140c1be52d4af824f2b759113c8a7bdc6@0,
04.11.2015 10:02:23.377 *WARN* [aysnc-index-update-fulltext-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Rejected directory listing l_1446631317105 
04.11.2015 10:02:24.104 *INFO* [aysnc-index-update-fulltext-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631315966, using 
04.11.2015 10:02:24.129 *INFO* [oak-lucene-0] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631317105, using 
04.11.2015 10:02:25.034 *INFO* [aysnc-index-update-fulltext-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631344105 as _xi.tim,7775213,f8e401ff1a95bba1c387a7d00239a9b10c4323ba;_xi.si,326,fcb94171f92573a5fd365b178dda96728c16da50;_xi_3.del,38,703e0294067b659a0f15f8114659b20de3d51385;_1oa.si,235,ddb8cb303fdf4372cc333a980a1755d25d64d6cb;_xi.pos,2748906,154e1ba4078865bc81f35646e36dea906e54b539;_xi.nvd,159060,abacf7b5c17963a24c9f54715ac40c1e2dd85f0b;_xi.fdt,14816174,ff821a1bde2330f1389c782a777747677206c685;_xj.cfe,224,1403e26ddc7e3d005758f3dae6c8bbf50e9a4313;_1o9.cfe,224,f7926636965fdd76c2622d57f3e24a217f230a44;_xi.nvm,46,d09f4ec10424aac4b5a2fe1da422f266aace8bca;_xj.si,232,f119a8cab06c10a0e34d87b92e16bf1b28a688f3;_xi.fdx,1272306,0c62a780b2f3c62af3014bc530d79d8144a8f014;segments.gen,20,7b46fe18999a02b4247cbcd8222034d4a2c9291c;segments_1o6,157,21acb7510ff37ef18452d622d8fcf2cdf90b9f58;_1o9.cfs,1595,d13a5088a06954752ea9ef2cc16510875fecb576;_xi.fnm,67704,fcf53c6f0a6169fe01eb7544d1a0c8118a1ef9a2;_1oa.cfs,1595,2cdbe7f26378deac710369f9c58b612c3c875260;segments_1o7,157,9e4d82f8520c49961c0e306c379bb01b2cbbce98;_xi.doc,2525612,8e52ab108ede6d344dbbf9322dbfee7396d26ab7;_1o9.si,235,58e8707ed138eb234f669aafd5416c7b78385932;_xi.tip,187482,0ead85853f2b7b1eceb4eef8290526a469bafa9c;_xj.cfs,4374,a2cda886ae2be975e61c26b830d85633e222c1dc;_1oa.cfe,224,ceadd06ff7a7ddafffde5e87acd83068924c3170;

04.11.2015 10:03:28.515 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423597, using 
04.11.2015 10:03:28.819 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606422099, using 
04.11.2015 10:03:29.436 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631317105, using 
04.11.2015 10:03:29.744 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631340391, using 
04.11.2015 10:03:29.846 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423630, using 
04.11.2015 10:03:29.867 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423536, using 
04.11.2015 10:03:29.908 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446606423673, using 
04.11.2015 10:03:37.928 *INFO* [aysnc-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Accepted directory listing l_1446631340391, using 
04.11.2015 10:03:42.077 *INFO* [aysnc-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving due to sync  [_4.cfe, _0.cfs, _0.cfe, _0_1.del, _4.si, _0.si, _4.cfs] .
04.11.2015 10:03:42.108 *INFO* [aysnc-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$GenerationalDirectoryListing
Saving Listing state to l_1446631422077 as _3.si,229,1a1c2c541a3cb5087d2a3c60ae77b6b05461410d;segments_4,117,a74de5ae54af2abbb2423573cd0ad93b950ec18a;_0_1.del,36,906f2506ff277e28716cc19eb8b55f289e34c53c;_4.cfe,224,bdce2bd8251981b9cb72acdc8205466e6c6262d2;_0.si,229,57e0616d14993ad2a9680f55d4151d440cad8255;_4.cfs,1188,a5a0324f5a05dfef2f57f5962240c82e9d62d88a;segments.gen,20,395c2b9ba7f05f4debb52b0a7cea8ac56ad671a2;_3.cfe,224,2596860e7bcdd550e221488708afda8729689107;_0.cfs,1431868,a7368bb6e2398a5952eddbc062a498f100a29865;segments_5,0,da39a3ee5e6b4b0d3255bfef95601890afd80709;_0.cfe,266,62127a60fe2224e32e3720eb15b2bd9f34d4670a;_3.cfs,1210,1c7764b0713716c53d1b0ed21c0063c5606aad49;_4.si,229,e50c79679300f6da51ca69bda351f838ee910015;

{code}

In this instance there appears to be an error reading data from the MongoDB blob store (04.11.2015
10:02:23.375). That causes the SHA1 not to match and first directory listing is not opened
(04.11.2015 10:02:23.377). The second directory listing is used (04.11.2015 10:02:24.104)
and both the reader and the async indexer continue to operate normally without errors. The
error was probably generated by using default write concern and killing the JVM and mongoDB
instances. There will have been dataloss in the index, however, the JVM continued as did both
searching and indexing. 

The overnight test and restarts only caused this behaviour once.

More tests are required.

One observation. Indexing does not start until the Topology heatbeat timeout has passed, which
appears to be about 120s, meaning the index will be 120s behind on starting an instance, until
it catches up.

> Improve ability of the OakDirectory to recover from unexpected file errors
> --------------------------------------------------------------------------
>
>                 Key: OAK-3547
>                 URL: https://issues.apache.org/jira/browse/OAK-3547
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>    Affects Versions: 1.4
>            Reporter: Ian Boston
>
> Currently if the OakDirectory finds that a file is missing or in some way damaged, and
exception is thrown which impacts all queries using that index, at times making the index
unavailable. This improvement aims to make the OakDirectory recover to a previously ok state
by storing which files were involved in previous states, and giving the code some way of checking
if they are valid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message