lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: TestRegexpRandom2
Date Mon, 26 Jul 2010 17:49:53 GMT
OK lemme try...

Mike

On Mon, Jul 26, 2010 at 1:47 PM, Robert Muir <rcmuir@gmail.com> wrote:
> maybe you can try this on your beast computer for a while? I think its
> better.
> Index: lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java
> ===================================================================
> --- lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java
> (revision 979377)
> +++ lucene/src/test/org/apache/lucene/search/TestRegexpRandom2.java (working
> copy)
> @@ -86,9 +86,9 @@
>    private class DumbRegexpQuery extends MultiTermQuery {
>      private final Automaton automaton;
>
> -    DumbRegexpQuery(Term term) {
> +    DumbRegexpQuery(Term term, int flags) {
>        super(term.field());
> -      RegExp re = new RegExp(term.text());
> +      RegExp re = new RegExp(term.text(), flags);
>        automaton = re.toAutomaton();
>      }
>
> @@ -130,8 +130,8 @@
>     * simple regexpquery implementation.
>     */
>    private void assertSame(String regexp) throws IOException {
> -    RegexpQuery smart = new RegexpQuery(new Term("field", regexp));
> -    DumbRegexpQuery dumb = new DumbRegexpQuery(new Term("field", regexp));
> +    RegexpQuery smart = new RegexpQuery(new Term("field", regexp),
> RegExp.NONE);
> +    DumbRegexpQuery dumb = new DumbRegexpQuery(new Term("field", regexp),
> RegExp.NONE);
>
>      // we can't compare the two if automaton rewrites to a simpler enum.
>      // for example: "a\uda07\udcc7?.*?" gets rewritten to a simpler query:
> Index:
> lucene/src/test/org/apache/lucene/util/automaton/AutomatonTestUtil.java
> ===================================================================
> --- lucene/src/test/org/apache/lucene/util/automaton/AutomatonTestUtil.java
> (revision 979377)
> +++ lucene/src/test/org/apache/lucene/util/automaton/AutomatonTestUtil.java
> (working copy)
> @@ -40,7 +40,9 @@
>        if (!UnicodeUtil.validUTF16String(regexp))
>          continue;
>        try {
> -        return new RegExp(regexp, RegExp.NONE);
> +        // NOTE: we parse-tostring-parse again, because we are
> +        // really abusing RegExp.toString() here (its just for debugging)
> +        return new RegExp(new RegExp(regexp, RegExp.NONE).toString(),
> RegExp.NONE);
>        } catch (Exception e) {}
>      }
>    }
>
> On Mon, Jul 26, 2010 at 1:32 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>> Hmm that doesn't fix it.
>>
>> The magical seed is: 50686536365145364L (for TRR2).
>>
>> I still see the IAE even if I remove that RegExp.NONE.
>>
>> Mike
>>
>> On Mon, Jul 26, 2010 at 1:28 PM, Robert Muir <rcmuir@gmail.com> wrote:
>> > I don't think we should do that. I think i found the problem:
>> > AutomatonTestUtil's randomRegexp does this:
>> >       try {
>> >         return new RegExp(regexp, RegExp.NONE);
>> >       } catch (Exception e) {}
>> > I think the RegExp.NONE is the problem, and we should remove it.
>> > because the test then toString's it and compiles it again, but without
>> > this
>> > option.
>> > On Mon, Jul 26, 2010 at 1:16 PM, Michael McCandless
>> > <lucene@mikemccandless.com> wrote:
>> >>
>> >> My random stress testing hit an IllegalArgExc because the random
>> >> regexp was malformed.
>> >>
>> >> Does this patch look OK to fix?
>> >>
>> >> Index: src/test/org/apache/lucene/search/TestRegexpRandom2.java
>> >> ===================================================================
>> >> --- src/test/org/apache/lucene/search/TestRegexpRandom2.java
>> >>  (revision
>> >> 979227)
>> >> +++ src/test/org/apache/lucene/search/TestRegexpRandom2.java
>> >>  (working
>> >> copy)
>> >> @@ -130,6 +130,14 @@
>> >>    * simple regexpquery implementation.
>> >>    */
>> >>   private void assertSame(String regexp) throws IOException {
>> >> +    try {
>> >> +      new RegExp(regexp);
>> >> +    } catch (IllegalArgumentException iae) {
>> >> +      // the random regexp could be malformed, eg "foo"bar",
>> >> +      // so we ignore this
>> >> +      return;
>> >> +    }
>> >> +
>> >>     RegexpQuery smart = new RegexpQuery(new Term("field", regexp));
>> >>     DumbRegexpQuery dumb = new DumbRegexpQuery(new Term("field",
>> >> regexp));
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: dev-help@lucene.apache.org
>> >>
>> >
>> >
>> >
>> > --
>> > Robert Muir
>> > rcmuir@gmail.com
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>
>
>
> --
> Robert Muir
> rcmuir@gmail.com
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message