lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
Date Wed, 11 Feb 2015 22:05:26 GMT
No, because you only looked into one bug. We have seen, and still see,
many G1-related test failures, including on the latest 1.8.0 update 40
early access editions. These include things like corruption.

I added this message with *every intention* to scare away users,
because I don't want them having index corruption.

I am sick of people asking "but isn't it fine on the latest version?"
and so on. It is not.

On Wed, Feb 11, 2015 at 11:41 AM, McKinley, James T
<james.mckinley@cengage.com> wrote:
> Hi,
>
> A couple of mailing list members have brought the following paragraph
> from the https://wiki.apache.org/lucene-java/JavaBugs page to my
> attention:
>
> "Do not, under any circumstances, run Lucene with the G1 garbage
> collector. Lucene's test suite fails with the G1 garbage collector on a
> regular basis, including bugs that cause index corruption. There is no
> person on this planet that seems to understand such bugs (see
> https://bugs.openjdk.java.net/browse/JDK-8038348, open for over a year),
> so don't count on the situation changing soon. This information is not
> out of date, and don't think that the next oracle java release will fix
> the situation."
>
> Since we run Lucene 4.8.1 on Java(TM) SE Runtime Environment (build
> 1.7.0_04-b20), Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed
> mode), using G1GC in production, I felt I should look into the issue and
> see if it is reproducible in our environment. First I read the bug linked
> in the above paragraph as well as
> https://issues.apache.org/jira/browse/LUCENE-5168. It appears that Dawid
> Weiss and Vladimir Kozlov have already done quite a bit of work trying to
> track down this bug, but it seems to be limited to the 32-bit JVM (maybe
> even only on Windows). To quote Dawid Weiss from the Jira bug:
>
> "My quest continues
>
> I thought it'd be interesting to see how far back I can trace this
> issue. I fetched the official binaries for jdk17 (windows, 32-bit) and
> did a binary search with the failing Lucene test command. The results
> show that, in short:
>
> ...
> jdk1.7.0_03: PASSES
> jdk1.7.0_04: FAILS
> ...
>
> and are consistent before and after. jdk1.7.0_04, 64-bit does *NOT*
> exhibit the issue (and neither does any version afterwards, it only
> happens on 32-bit; perhaps it's because of smaller number of available
> registers and the need to spill?).
>
> jdk1.7.0_04 was when G1GC was "officially" made supported but I don't
> think this plays a big difference. I'll see if I can bsearch on
> mercurial revisions to see which particular revision introduced the
> problem. Anyway, the problem has to be a long-standing issue and not a
> regression. Which makes it even more interesting I guess.
>
> Dawid"
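
The binary search over JDK builds described in the quoted comment can be sketched roughly as follows. This is only an illustration, not the procedure Dawid actually used: `first_failing`, `nth`, and the check command are hypothetical names, and the sketch assumes that every build before the first failing one passes and every build after it fails, which matches the "consistent before and after" observation.

```shell
# Print the argument at zero-based position N.
# Usage: nth N ARG...
nth() {
    shift "$(( $1 + 1 ))"
    printf '%s\n' "$1"
}

# Print the first build for which CHECK fails, using binary search.
# Usage: first_failing CHECK BUILD...
# CHECK is a hypothetical command or function that receives one build
# name and exits 0 when the test passes on that build.
# Returns 1 if no build in the list fails.
first_failing() {
    local check="$1"
    shift
    local lo=0
    local hi=$#              # index one past the last candidate
    local mid build
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        build=$(nth "$mid" "$@")
        if "$check" "$build"; then
            lo=$(( mid + 1 ))    # passes: the first failure is later
        else
            hi=$mid              # fails: the first failure is here or earlier
        fi
    done
    [ "$lo" -lt "$#" ] || return 1
    nth "$lo" "$@"
}
```

In practice the check would select the given JDK and run the failing Lucene test command; that part depends on the local setup and is left out here.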
>
> In addition, the second to last comment in the LUCENE-5168 bug is "I
> don't think this is closely related to G1GC. It looks more that G1GC
> happily triggers this bug in this special case."
>
> Just to make sure the bug wasn't reproducible with our specific
> environment, I checked out the tag for Lucene 4.8.1
> (http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_8_1) and
> made the following change to common-build.xml:
>
> gada@C006129:~/workspace-java/lucene_solr_4_8_1/lucene$ svn diff common-build.xml
> Index: common-build.xml
> ===================================================================
> --- common-build.xml    (revision 1658458)
> +++ common-build.xml    (working copy)
> @@ -92,7 +92,7 @@
>    </path>
>
>    <!-- default arguments to pass to JVM executing tests -->
> -  <property name="args" value=""/>
> +  <property name="args" value="-XX:+UnlockDiagnosticVMOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=65 -XX:ParallelGCThreads=12 -verbose:gc -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Xloggc:/home/gada/tmp/lucene-test-gc.log -XX:LogFile=/home/gada/tmp/lucene-test-vmop.log -XX:+LogVMOutput -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1"/>
>
>    <property name="tests.seed" value="" />
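
Because Ant gives properties set with -D on the command line precedence over `<property>` elements in the build file, the same `args` value can probably be supplied without patching common-build.xml. A minimal sketch, assuming the `args` property shown in the diff is the only place the default is set; `jvm_args`, `run_tests_with_g1`, and the `ANT` override variable are hypothetical helpers, and only two of the flags from the diff are hard-coded:

```shell
# Join JVM flags into the single space-separated string that the
# "args" property expects.
jvm_args() {
    local IFS=' '
    printf '%s\n' "$*"
}

# Run "ant test" with G1 enabled, passing any extra flags through.
# ANT is a hypothetical override so the command can be dry-run,
# for example with ANT=echo.
run_tests_with_g1() {
    "${ANT:-ant}" test "-Dargs=$(jvm_args -XX:+UseG1GC -XX:MaxGCPauseMillis=100 "$@")"
}
```

For example, `run_tests_with_g1 -verbose:gc -XX:+PrintGCDetails` would add GC logging flags for a single run while leaving the working copy unmodified.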
>
> I then ran the following script:
>
> #!/bin/bash
> count=0
> while ant test ; do
>         count=$((count + 1))
>         printf '\n\n\nrun %d completed without errors\n\n\n' "$count"
>         if [ "$count" -ge 100 ]; then
>                 break
>         fi
>         sleep 1
> done
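
Note that the loop above simply ends when `ant test` fails, without saying which run failed. A hedged variant that reports the failing run and returns a non-zero status could look like this; `repeat_until_failure` is a hypothetical helper name, not part of the Lucene build:

```shell
# Run a command up to MAX times and stop at the first failure.
# Usage: repeat_until_failure MAX CMD [ARG...]
# Returns 0 if every run succeeded, 1 as soon as one run fails.
repeat_until_failure() {
    local max="$1"
    shift
    local run=1
    while [ "$run" -le "$max" ]; do
        if ! "$@"; then
            printf 'run %d FAILED\n' "$run" >&2
            return 1
        fi
        printf 'run %d completed without errors\n' "$run"
        run=$(( run + 1 ))
    done
}
```

The equivalent of the original script would then be `repeat_until_failure 100 ant test`.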
>
> All tests ran successfully 100 times in a row on a dual 6-core Intel
> Xeon Lenovo C30 ThinkStation with 64GB RAM running the Ubuntu 14.04 LTS
> Linux distribution. I also successfully ran the test suite a few times on
> Java(TM) SE Runtime Environment (build 1.7.0_55-b13) Java HotSpot(TM)
> 64-Bit Server VM (build 24.55-b03, mixed mode) since I had it available.
>
> TL;DR:
>
> I think perhaps the sentence: "Do not, under any circumstances, run
> Lucene with the G1 garbage collector." is a bit too strong. Maybe a more
> balanced statement is in order? For example, "we've found that the
> OpenJDK/Oracle 32-bit JVM (if only on Windows, say only on Windows) has a
> bug that when used in combination with the G1 garbage collector causes
> incorrect code to be produced possibly resulting in index corruption", or
> something along those lines. It seems a shame to possibly scare new
> Lucene users away from using G1GC with the 64-bit JVM given that it has
> better performance on large heaps, which are becoming more common today.
>
> FWIW,
> Jim
> ________________________________________
> From: McKinley, James T [james.mckinley@cengage.com]
> Sent: Monday, February 09, 2015 11:00 AM
> To: java-user@lucene.apache.org
> Subject: RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>
> OK, thanks Erick. I have put a story in our jira backlog to investigate
> the G1GC issues with the Lucene test suite. I don't know if we'll be able
> to shed any light on the issue, but since we're using Lucene with Java 7
> G1GC, I guess we had better investigate it.
>
> Jim
> ________________________________________
> From: Erick Erickson [erickerickson@gmail.com]
> Sent: Saturday, February 07, 2015 2:22 PM
> To: java-user
> Subject: Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>
> The G1GC issue referenced by Robert Muir on the wiki page is at the
> Lucene level. Lucene, of course, is critically important to Solr, so
> from that perspective it is about Solr too.
>
> https://wiki.apache.org/lucene-java/JavaBugs
>
> And, I assume, it also applies to your custom app.
>
> FWIW,
> Erick
>
> On Fri, Feb 6, 2015 at 12:10 PM, McKinley, James T
> <james.mckinley@cengage.com> wrote:
>> Just to be clear, in case there was any confusion about my previous
>> message regarding G1GC: we do not use Solr; my team works on a
>> proprietary Lucene-based search engine. Consequently, I can't really
>> give any advice regarding Solr with G1GC, but for our uses (so far
>> anyway), G1GC seems to work well with Lucene.
>>
>> Jim
>> ________________________________________
>> From: Piotr Idzikowski [piotridzikowski@gmail.com]
>> Sent: Friday, February 06, 2015 5:35 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>>
>> Hello.
>> A somewhat delayed question, but I recently found these articles:
>> https://wiki.apache.org/solr/SolrPerformanceProblems
>> https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning
>>
>> Especially this part from the first URL:
>> *Using the ConcurrentMarkSweep (CMS) collector with tuning parameters is a
>> very good option for Solr, but with the latest Java 7 releases (7u72 at
>> the time of this writing), G1 is looking like a better option, if the
>> -XX:+ParallelRefProcEnabled option is used.*
>>
>> How does it play with *"Do not, under any circumstances, run Lucene with
>> the G1 garbage collector."*
>> from https://wiki.apache.org/lucene-java/JavaBugs?
>>
>> Regards
>> Piotr
>>
>> On Tue, Jan 27, 2015 at 9:55 PM, McKinley, James T <
>> james.mckinley@cengage.com> wrote:
>>
>>> Hi Uwe,
>>>
>>> OK, thanks for the info.  We'll see if we can download the Lucene test
>>> suite and check it out.
>>>
>>> FWIW, we use G1GC in our production runtime (~70 12-16 core Cisco UCS and
>>> HP Gen7/Gen8 nodes with 20+ GB heaps using Java 7 and Lucene 4.8.1 with
>>> pairs of 30 index partitions with 15M-23M docs each) and have not
>>> experienced any VM crashes (well, maybe a couple, but not directly
>>> traceable to G1 to my knowledge).  We have found some undocumented pauses
>>> in G1 due to very large object arrays and filed a bug report which was
>>> confirmed and also affects CMS (we worked around this in our code using
>>> memory mapping of some files whose contents we previously held all in
>>> RAM).  I think the only index corruption we've ever seen was in our index
>>> creation workflow (~30 HP Gen7 nodes with 27GB heaps) but this was using
>>> Parallel GC since it is a batch system, so that corruption (which we've not
>>> seen recently and never found a cause for) was definitely not due to G1GC.
>>>
>>> G1GC has bugs as does CMS but we've found it to work pretty well so far in
>>> our runtime system.  Of course YMMV, thanks again for the info.
>>>
>>> Jim
>>> ________________________________________
>>> From: Uwe Schindler [uwe@thetaphi.de]
>>> Sent: Tuesday, January 27, 2015 3:02 PM
>>> To: java-user@lucene.apache.org
>>> Subject: RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>>>
>>> Hi,
>>>
>>> About G1GC: we consistently see problems when running the Lucene test
>>> suite with G1GC enabled. The people from Elasticsearch concluded:
>>>
>>> "There is a newer GC called the Garbage First GC (G1GC). This newer GC is
>>> designed to minimize pausing even more than CMS, and operate on large
>>> heaps. It works by dividing the heap into regions and predicting which
>>> regions contain the most reclaimable space. By collecting those regions
>>> first (garbage first), it can minimize pauses and operate on very large
>>> heaps.
>>>
>>> Sounds great! Unfortunately, G1GC is still new, and fresh bugs are found
>>> routinely. These bugs are usually of the segfault variety, and will cause
>>> hard crashes. The Lucene test suite is brutal on GC algorithms, and it
>>> seems that G1GC hasn’t had the kinks worked out yet.
>>>
>>> We would like to recommend G1GC someday, but for now, it is simply not
>>> stable enough to meet the demands of Elasticsearch and Lucene."
>>> (http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_don_8217_t_touch_these_settings.html)
>>>
>>> In fact, the problems with G1GC can sometimes lead to index corruption,
>>> and are hard to reproduce. So it is better not to use it.
>>>
>>> Uwe
>>>
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: uwe@thetaphi.de
>>>
>>>
>>> > -----Original Message-----
>>> > From: McKinley, James T [mailto:james.mckinley@cengage.com]
>>> > Sent: Tuesday, January 27, 2015 8:58 PM
>>> > To: java-user@lucene.apache.org
>>> > Subject: RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>>> >
>>> > Why do you say not to use G1GC?  We are using Java 7 & G1GC with Lucene
>>> > 4.8.1 in production.  Thanks.
>>> >
>>> > Jim
>>> > ________________________________________
>>> > From: Uwe Schindler [uwe@thetaphi.de]
>>> > Sent: Tuesday, January 27, 2015 2:49 PM
>>> > To: java-user@lucene.apache.org; 'kiwi clive'
>>> > Subject: RE: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>>> >
>>> > Java 8 update 20 or later is also fine. At the current time, always use
>>> > the latest update release and you will be fine with Java 7 and Java 8.
>>> > Don't use older releases and don't use the G1 garbage collector.
>>> >
>>> > -----
>>> > Uwe Schindler
>>> > H.-H.-Meier-Allee 63, D-28213 Bremen
>>> > http://www.thetaphi.de
>>> > eMail: uwe@thetaphi.de
>>> >
>>> >
>>> > > -----Original Message-----
>>> > > From: kiwi clive [mailto:kiwi_clive@yahoo.com.INVALID]
>>> > > Sent: Tuesday, January 27, 2015 8:03 PM
>>> > > To: java-user@lucene.apache.org
>>> > > Subject: Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>>> > >
>>> > > Hi Hoss,
>>> > > Many thanks for the information. This looks very encouraging as the
>>> > > Java7 bug I remember was fixed and as far as I know, we should not be
>>> > > affected by the others.
>>> > > I'll put a few tests together and put my toe in the water :-)
>>> > > Clive
>>> > >
>>> > > From: Chris Hostetter <hossman_lucene@fucit.org>
>>> > > To: "java-user@lucene.apache.org" <java-user@lucene.apache.org>; kiwi clive <kiwi_clive@yahoo.com>
>>> > > Sent: Tuesday, January 27, 2015 4:01 PM
>>> > > Subject: Re: Lucene Version Upgrade (3->4) and Java JVM Versions(6->8)
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > : I seem to remember reading that certain versions of lucene were
>>> > > : incompatible with some java versions although I cannot find anything
>>> > > : to verify this. As we have tens of thousands of large indexes,
>>> > > : backwards compatibility without the need to reindex on an upgrade is
>>> > > : of prime importance to us.
>>> > >
>>> > > All known JVM bugs affecting Lucene are listed here...
>>> > >
>>> > > https://wiki.apache.org/lucene-java/JavaBugs
>>> > >
>>> > >
>>> > > -Hoss
>>> > > http://www.lucidworks.com/
>>> > >
>>> > > ---------------------------------------------------------------------
>>> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> > > For additional commands, e-mail: java-user-help@lucene.apache.org