lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (LUCENE-7914) Add safeguards to RegExp.toAutomaton and Operations
Date Fri, 04 Aug 2017 10:14:00 GMT


ASF subversion and git services commented on LUCENE-7914:

Commit 3610c8d1b8927fdb61667ea4c49f983bbe13a404 in lucene-solr's branch refs/heads/branch_7x
from [~jimczi]
[;h=3610c8d ]

LUCENE-7914: Add a maximum recursion level in automaton recursive functions (Operations.isFinite
and Operations.topsortState) to prevent large automaton to overflow the stack.

> Add safeguards to RegExp.toAutomaton and Operations
> ---------------------------------------------------
>                 Key: LUCENE-7914
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Jim Ferenczi
>             Fix For: master (8.0), 7.1
>         Attachments: LUCENE-7914.patch, LUCENE-7914.patch, LUCENE-7914.patch, LUCENE-7914.patch,
> When creating an automaton from a regexp, some operators can create more states than
> {code}
> a{10000}
> {code}
> The example above <b>creates</b> a single path with 10k states already determinized
so maxDeterminizedStates is not checked. 
> Some operations on automaton like Operations.isFinite or Operations.topoSortStates are
recursive and the maximum level of recursion depends on the longest path in the automaton.
So a large automaton like above can exceed java's stack.
> In most of the cases we are covered by maxDeterminizedStates but there will always be
adversarial cases where a large automaton is created from a small input so I think we should
also have safeguards in the recursive methods. 
> I've attached a patch that adds a max recursion level to Operations.isFinite and Operations.topoSortStates
in order to limit stack overflows. The limit is set to 1000 so any automaton with a path bigger
than 1000 would throw an IllegalStateException.
> The patch also uses maxDeterminizedStates to limit the number of states that a repeat
operator can create and throw a TooComplex..Exception when this limit is reached.
> Finally the patch adds the ability to skip Operations.isFinite on AutomatonQuery and
uses this as an optimization for PrefixQuery that uses infinite automatons only.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message