lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jan Høydahl <jan....@cominvent.com>
Subject Re: Reporting multiple issues triggering an HTTP 500 in Solr
Date Thu, 31 Jan 2019 12:37:31 GMT
Thanks for the reports, it will be good to get rid of NPEs even if the input values are not
likely to happen IRL.
Reading your blog post, at the bottom you have a table with various kinds of exceptions triggering
500 responses.
Out of those, most are expected and wanted errors in my opinion, such as NumberFormatException
or IllegalArgumentException,
they clearly tell the client that their input is wrong. Also, SolrException is in most cases
thrown to explain what went wrong.

But for NPE, ClassCastException, AIOOB and SIOOB exceptions are good candidates for better
input validation and fixing.

BTW, here is a JIRA query that will list all your reported issues, in case someone want to
fix multiple of them in one single commit
https://issues.apache.org/jira/browse/SOLR-13180?jql=project%20%3D%20SOLR%20AND%20text%20~%20diffblue

You may also consider adding a label=diffblue as well to display all issues with a click,
as well as label=newdev to signal that these are excellent tasks to be done by new Solr/Lucene
developers.

Instead of attaching large home.zip archives to the issues, it would be much more helpful
with a few simple steps as {code} block in the JIRA to reproduce, e.g.

bin/solr start -c
bin/solr create -c films
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name",
"type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
bin/post -c films example/films/films.json
curl "http://localhost:8983/solr/films/select?json=0"

Note that the last line there reproduces the exception.
This allows any developer to simply copy/paste those five lines to reproduce :)

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 28. jan. 2019 kl. 17:06 skrev César Rodríguez <cesar.rodriguez@diffblue.com>:
> 
> Thanks, we will do that.
> 
> Just to be clear, we are talking about opening from 50 to 70 jira
> tickets. We found 77 unique points in the source code where an
> exception is thrown that causes an HTTP 500, but I'm guessing that
> some of them will not be serious enough to be reported.
> 
> We can provide patches for the two issues described below. We will do
> our best to describe the probable cause of the error on each
> individual report, but we won't be able to provide patches for most of
> them.
> 
> More information about this testing effort can be found here:
> https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results
> 
> On Mon, Jan 28, 2019 at 3:03 PM Mikhail Khludnev <mkhl@apache.org> wrote:
>> 
>> Yes. Please create jiras and attach patches. Tests are highly appreciated.
>> 
>> 
>> On Mon, Jan 28, 2019 at 4:49 PM César Rodríguez <cesar.rodriguez@diffblue.com>
wrote:
>>> 
>>> Hi there,
>>> 
>>> We analyzed the source code of Apache Solr and found a number of
>>> issues that we would like to report. We configured Solr using the
>>> films collection from the quick start tutorial [2]. For every issue
>>> that we found we can provide a request URL for which Solr returns an
>>> HTTP 500 error (usually due to an uncaught exception).
>>> 
>>> Please find below two example issues we found, together with an
>>> explanation of the cause and a patch fixing the bug (attached patches
>>> pass the unit tests). The issues we would like to report are similar
>>> to those. We found them with help from Diffblue Microservice Testing
>>> [1].
>>> 
>>> We would like to know:
>>> * Should we open JIRA tickets for each one of them?
>>> * We can provide a URL that triggers the problem, a stack trace, and
>>> information about the server setup. What other information would you
>>> like to see?
>>> 
>>> I look forward to your response.
>>> 
>>> Best regards,
>>> César
>>> 
>>> PS. If this question is not suitable for this list, please indicate
>>> where to follow up.
>>> 
>>> === Issue 1: Null pointer exception due to unhandled case in
>>> ComplexPhraseQueryParser  ===
>>> 
>>> Assume that we request the following URL:
>>> 
>>> /solr/films/select?q={!complexphrase}genre:"-om*"
>>> 
>>> Handling this query involves constructing a SpanQuery, which happens
>>> in the rewrite method of ComplexPhraseQueryParser. In particular, the
>>> expression is decomposed into a BooleanQuery, which has exactly one
>>> clause, namely the negative clause -genre:”om*”. The rewrite method
>>> then further transforms this into a SpanQuery; in this case, it goes
>>> into the path that handles complex queries with both positive and
>>> negative clauses. It extracts the subset of positive clauses - note
>>> that this set of clauses is empty for this query. The positive clauses
>>> are then combined into a SpanNearQuery (around line 340), which is
>>> then used to build a SpanNotQuery.
>>> 
>>> Further down the line, the field attribute of the SpanNearQuery is
>>> accessed and used as an index into a TreeMap. But since we had an
>>> empty set of positive clauses, the SpanNearQuery does not have its
>>> field attribute set, so we get a null here - this leads to an
>>> exception.
>>> 
>>> A possible fix would be to detect the situation where we have an empty
>>> set of positive clauses and include a single synthetic clause that
>>> matches either everything or nothing. See attached file
>>> 0001-Fix-NullPointerException.patch.
>>> 
>>> === Issue 2: StringIndexOutOfBoundsException when expanding macros ===
>>> 
>>> Assume that we request the following URL:
>>> 
>>> /solr/films/select?a=${${b}}
>>> 
>>> Parameter macro expansion [3]  seems to take place in
>>> org.apache.solr.request.macro.MacroExpander._expand(String val).
>>> However, this method throws a StringIndexOutOfBoundsException for the
>>> URL above. From reading the code it seems that macros are not expanded
>>> inside curly brackets ${...}, and so the “${b}” inside “${${b}}”
>>> should not be expanded. But the function seems to fail to detect this
>>> specific case and graciously refuse to expand it. Instead, it
>>> unexpectedly throws the exception. A possible fix could be updating
>>> the ‘idx’ variable when the StrParser detects that no valid identifier
>>> can be found inside the brackets. See attached file
>>> 0001-Macro-expander-fail-gracefully-on-unsupported-syntax.patch.
>>> 
>>> References:
>>> [1] https://www.diffblue.com/labs/
>>> [2] https://lucene.apache.org/solr/guide/7_6/solr-tutorial.html
>>> [3] http://yonik.com/solr-query-parameter-substitution/
>>> 
>>> --
>>> Diffblue Limited, a company registered in England and Wales number
>>> 09958102, with its registered office at Ramsey House, 10 St. Ebbes Street,
>>> Oxford, OX1 1PT, England
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> 
>> 
>> 
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> 
>> Diffblue Limited, a company registered in England and Wales number 09958102, with
its registered office at Ramsey House, 10 St. Ebbes Street, Oxford, OX1 1PT, England
> 
> -- 
> Diffblue Limited, a company registered in England and Wales number 
> 09958102, with its registered office at Ramsey House, 10 St. Ebbes Street, 
> Oxford, OX1 1PT, England
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 


Mime
View raw message