lucene-solr-user mailing list archives

From Shawn Heisey <apa...@elyograg.org>
Subject Re: 6.x to 7.x differences
Date Wed, 12 Sep 2018 09:49:41 GMT
On 9/11/2018 8:32 PM, John Blythe wrote:
> we recently migrated to cloud. part of that migration jumped us from 6.1 to
> 7.4.
>
> one example query between our old solr instance and our new cloud instance
> produces 42 results and 19k results.
>
> the analyzer is the same aside from WordDelimiterFilterFactory moving over
> to the graph variation of it and the lucene parser moving from 6.1 to 7.4
> obviously.

Did you completely reindex after changing your schema?  Skipping that
step, especially when reusing an index built by the earlier version, can
lead to problems.  Have you tried the non-graph version of WDF (with a
complete reindex) to see whether that changes anything?  The non-graph
filter will disappear in 8.0, but it's still there for all of 7.x.

Adding "debug=query" to your URL parameters is very useful in locating 
differences.  Maybe 6.1 and 7.4 are parsing the query differently.  
There's a good chance that this will reveal something we can pursue.
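
As a rough sketch of that comparison, the snippet below builds the same
request for both servers with debug=query added, and pulls the
"parsedquery" entry out of a JSON response.  The host names, the
"products" collection, and the example query are placeholders, not
details from this thread; substitute your own.

```python
import json
from urllib.parse import urlencode


def build_debug_url(base_url, collection, query):
    """Build a /select URL that asks Solr to include query debug output."""
    params = {
        "q": query,
        "rows": 0,          # we only care about the parse, not the documents
        "wt": "json",
        "debug": "query",   # adds a "debug" section with the parsed query
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


def parsed_query(response_text):
    """Extract the parsed query string from a JSON response body."""
    return json.loads(response_text)["debug"]["parsedquery"]


# Hypothetical hosts and collection name; replace with the real ones.
old_url = build_debug_url("http://old-solr:8983/solr", "products", "wi-fi router")
new_url = build_debug_url("http://new-solr:8983/solr", "products", "wi-fi router")
print(old_url)
print(new_url)
```

Fetching both URLs and comparing what parsed_query() returns for each
should show whether the two versions build different queries from the
same input.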

> i've used the analysis tool in solr admin to try to determine the
> difference between the two. i'm seeing the same output between index and
> query results yet when actually running the queries have that huge
> divergence of results.

One of the big differences between 6.x and 7.x for query parsing is that 
the sow (split on whitespace) parameter defaults to true in 6.x (and I 
think it didn't even exist in 6.1, so it's effectively true).  In 7.x, 
that parameter defaults to false.  So the query parser in 7.x tends to 
behave *exactly* like what you see in the analysis tool, whereas in 6.x 
the input would be split on whitespace before ever reaching analysis, 
which can result in very subtle differences in how the input is 
analyzed.  Adding "sow=true" to your URL parameters is something you can 
try as a quick test.
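
A minimal sketch of that quick test, assuming you already have a working
query URL: the helper below sets or replaces one parameter, so the same
request can be sent with sow=true and sow=false and the result counts
compared.  The URL shown is a made-up placeholder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_param(url, name, value):
    """Return url with query parameter `name` set to `value` (replacing any existing one)."""
    parts = urlsplit(url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != name
    ]
    params.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical request URL; replace with the query that shows the discrepancy.
base = "http://new-solr:8983/solr/products/select?q=wi-fi+router&debug=query"
sow_true = set_param(base, "sow", "true")
sow_false = set_param(base, "sow", "false")
print(sow_true)
print(sow_false)
```

If the 7.4 result count with sow=true lines up with what 6.1 returns,
the changed default is the likely cause.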

Thanks,
Shawn

