lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kay Wrobel <kwro...@hawkusa.com>
Subject Re: How to debug empty ParsedQuery from Edismax Query Parser
Date Wed, 02 Jan 2019 19:07:05 GMT
Well, I was putting that info out there because I am literally hunting down this issue without
any guidance. The real problem for still is that the Edismax Query Parser behaves abnormally
starting with Version 5 until current giving me empty parsedQuery. Forcing the request through
the Lucene parser was one way I was hoping to get to the bottom of this. Frankly, Multi-Term
seems to be *the* new feature that was introduced in SOLR 5, and so I am jumping to conclusions
here.

I would hate to go as low-level as debugging SOLR source just to find out what is going on
here, but it sure seems that way. By the way, I have tried a multitude of other search terms
(ending in *), like:
602* (works)
602K* (does NOT work)
A3F* (works!, but is also single changing characters, so...)
AC* (works, but MANY results for obvious reasons)
6023* (works)

So again, it seems that as soon as there is more than one character involved and a "word"
is somewhat detected, the parser fails (in my specific case).

I am contemplating going down to the source-code level and debugging this issue; I am a programmer
and should be able to understand some of it. That said, it seems like a very time-consuming
thing to do. One last attempt right now is for me so change some logging level in the SOLR
7 instance and see what it spits out. I changed the following to "DEBUG":
org.apache.solr.search.PayloadCheckQParserPlugin
org.apache.solr.search.SurroundQParserPlugin
org.apache.solr.search.join.ChildFieldValueSourceParser

That didn't add any new information in the log file at all.


> On Jan 2, 2019, at 12:40 PM, Gus Heck <gus.heck@gmail.com> wrote:
> 
> If you mean attach a debugger, solr is just like any other java program.
> Pass in the standard java options at start up to have it listen or connect
> as usual. The port is just a TCP port so ssh tunneling the debugger port
> can bridge the gap with a remote machine (or a vpn).
> 
> That said the prior thread posts makes it sound like we are looking for a
> case where the query parser or something above it is inappropriately eating
> an exception relating to too many terms.
> 
> Did 5.x impose a new or reduced limit there?
> 
> On Wed, Jan 2, 2019, 1:20 PM Kay Wrobel <kwrobel@hawkusa.com wrote:
> 
>> Is there any way I can debug the parser? Especially, the edismax parser
>> which does *not* raise any exception but produces an empty parsedQuery?
>> Please, if anyone can help. I feel very lost and without guidance, and
>> Google search has not provided me with any help at all.
>> 
>>> On Dec 28, 2018, at 9:57 AM, Kay Wrobel <kwrobel@hawkusa.com> wrote:
>>> 
>>> Here are my log entries:
>>> 
>>> SOLR 7.x (non-working)
>>> 2018-12-28 15:36:32.786 INFO  (qtp1769193365-20) [   x:collection1]
>> o.a.s.c.S.Request [collection1]  webapp=/solr path=/select
>> params={q=ac6023*&qf=tm_field_product^21.0&qf=tm_title_field^8.0&EchoParams=all&rows=10&wt=xml&debugQuery=true}
>> hits=0 status=0 QTime=2
>>> 
>>> SOLR 4.x (working)
>>> INFO  - 2018-12-28 15:43:41.938; org.apache.solr.core.SolrCore;
>> [collection1] webapp=/solr path=/select
>> params={q=ac6023*&qf=tm_field_product^21.0&qf=tm_title_field^8.0&EchoParams=all&rows=10&wt=xml&debugQuery=true}
>> hits=32 status=0 QTime=8
>>> 
>>> EchoParams=all did not show anything different in the resulting XML from
>> SOLR 7.x.
>>> 
>>> 
>>> I found out something curious yesterday. When I try to force the
>> Standard query parser on SOLR 7.x using the same query, but adding
>> "defType=lucene" at the beginning, SOLR 7 raises a SolrException with this
>> message: "analyzer returned too many terms for multiTerm term: ac6023"
>> (full response: https://pastebin.com/ijdBj4GF)
>>> 
>>> Log entry for that request:
>>> 2018-12-28 15:50:58.804 ERROR (qtp1769193365-15) [   x:collection1]
>> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: analyzer
>> returned too many terms for multiTerm term: ac6023
>>>       at
>> org.apache.solr.schema.TextField.analyzeMultiTerm(TextField.java:180)
>>>       at
>> org.apache.solr.parser.SolrQueryParserBase.analyzeIfMultitermTermText(SolrQueryParserBase.java:992)
>>>       at
>> org.apache.solr.parser.SolrQueryParserBase.getPrefixQuery(SolrQueryParserBase.java:1173)
>>>       at
>> org.apache.solr.parser.SolrQueryParserBase.handleBareTokenQuery(SolrQueryParserBase.java:781)
>>>       at org.apache.solr.parser.QueryParser.Term(QueryParser.java:421)
>>>       at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:278)
>>>       at org.apache.solr.parser.QueryParser.Query(QueryParser.java:162)
>>>       at
>> org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:131)
>>>       at
>> org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:254)
>>>       at
>> org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:49)
>>>       at org.apache.solr.search.QParser.getQuery(QParser.java:173)
>>>       at
>> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:160)
>>>       at
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:279)
>>>       at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
>>>       at org.apache.solr.core.SolrCore.execute(SolrCore.java:2541)
>>>       at
>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709)
>>>       at
>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515)
>>>       at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)
>>>       at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)
>>>       at
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
>>>       at
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>>>       at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
>>>       at
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>>>       at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>       at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
>>>       at
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>>>       at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>>>       at
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)
>>>       at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>>>       at
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>>>       at
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>>>       at
>> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>>>       at
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)
>>>       at
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>>>       at
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>>>       at
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>>>       at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>       at
>> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>>>       at
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>>>       at org.eclipse.jetty.server.Server.handle(Server.java:531)
>>>       at
>> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)
>>>       at
>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
>>>       at org.eclipse.jetty.io
>> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
>>>       at org.eclipse.jetty.io
>> .FillInterest.fillable(FillInterest.java:102)
>>>       at org.eclipse.jetty.io
>> .ChannelEndPoint$2.run(ChannelEndPoint.java:118)
>>>       at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>>>       at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>>>       at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>>>       at
>> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>>>       at
>> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>>>       at
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)
>>>       at
>> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)
>>>       at java.lang.Thread.run(Thread.java:748)
>>> 
>>> 2018-12-28 15:50:58.809 INFO  (qtp1769193365-15) [   x:collection1]
>> o.a.s.c.S.Request [collection1]  webapp=/solr path=/select
>> params={q=ac6023*&defType=lucene&qf=tm_field_product^21.0&qf=tm_title_field^8.0&rows=10&wt=xml&debugQuery=true}
>> status=400 QTime=5
>>> 
>>> Could this be a multi-term issue, but EDisMax does not actually raise an
>> exception???
>>> 
>>> 
>>>> On Dec 27, 2018, at 9:32 PM, Alexandre Rafalovitch <arafalov@gmail.com>
>> wrote:
>>>> 
>>>> EchoParams=all
>>>> 
>>>> May also be helpful to pinpoint differences in params from all sources,
>>>> including request handler defaults.
>>>> 
>>>> Regards,
>>>>  Alex
>>>> 
>>>> On Thu, Dec 27, 2018, 8:25 PM Shawn Heisey <apache@elyograg.org wrote:
>>>> 
>>>>> On 12/27/2018 10:47 AM, Kay Wrobel wrote:
>>>>>> Now starting from SOLR version 5+, I receive zero (0) results back,
>> but
>>>>> more importantly, the Query Parser produces an empty parsedQuery.
>>>>>> 
>>>>>> Here is the same query issued to SOLR 7.6.0 (current version):
>>>>>> https://pastebin.com/XcNhfdUD <https://pastebin.com/XcNhfdUD>
>>>>>> 
>>>>>> Notice how "parsedQuery" now shows "+()"; an empty query string.
>>>>> 
>>>>> I can duplicate this result on a 7.5.0 example config by sending an
>>>>> edismax query with undefined parameters for df and qf. The other
>>>>> field-related parameters for edismax are also undefined.  The following
>>>>> URL parameters with the default example config will produce that parsed
>>>>> query:
>>>>> 
>>>>> q=ac6023*&defType=edismax&df=&qf=&debugQuery=on
>>>>> 
>>>>> When a query is made and Solr's logging configuration is at its default
>>>>> setting, Solr will log a line into its logfile containing all of the
>>>>> parameters in the query, both those provided on the URL and those set
>> by
>>>>> Solr's configuration (solrconfig.xml).  Can you share this log line
>> from
>>>>> both the version that works and the version that doesn't?
>>>>> 
>>>>> This is the log line created when I ran the query mentioned above:
>>>>> 
>>>>> 2018-12-27 23:03:23.199 INFO  (qtp315932542-23) [   x:baz]
>>>>> o.a.s.c.S.Request [baz]  webapp=/solr path=/select
>>>>> params={q=ac6023*&defType=edismax&df=&qf=&debugQuery=on}
hits=0
>> status=0
>>>>> QTime=0
>>>>> 
>>>>> What I'm thinking is that there is a difference in the configuration
of
>>>>> the two servers or the actual query being sent is different.  Either
>>>>> way, there's something different.  The two log lines that I have asked
>>>>> for are likely to be different from each other in some way that will
>>>>> explain what you're seeing.
>>>>> 
>>>>> Thanks,
>>>>> Shawn
>>>>> 
>>>>> 
>>> 
>> 
>> 
>> --
>> 
>> The information in this e-mail is confidential and is intended solely for
>> the addressee(s). Access to this email by anyone else is unauthorized. If
>> you are not an intended recipient, you may not print, save or otherwise
>> store the e-mail or any of the contents thereof in electronic or physical
>> form, nor copy, use or disseminate the information contained in the email.
>> 
>> If you are not an intended recipient,  please notify the sender of this
>> email immediately.
>> 


-- 

The information in this e-mail is confidential and is intended solely for 
the addressee(s). Access to this email by anyone else is unauthorized. If 
you are not an intended recipient, you may not print, save or otherwise 
store the e-mail or any of the contents thereof in electronic or physical 
form, nor copy, use or disseminate the information contained in the email.  
If you are not an intended recipient,  please notify the sender of this 
email immediately.

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message