lucene-solr-user mailing list archives

From David Hastings <hastings.recurs...@gmail.com>
Subject Re: Phrase Fields performance
Date Tue, 04 Apr 2017 15:36:14 GMT
FYI, I think I managed to get back both the results and the speeds I wanted
by reducing the number of fields in the qf/pf values from 6 to 4, making
sure not to boost the default field, and reducing the boost values to much
smaller numbers that are still significant enough to boost properly. Average
qtime went from around 0.3 seconds before qf/pf, to above 1 second with the
aggressive settings, and is now back down to around half a second with the
modified values, which I can live with.

Also, if anyone else stores qtimes in a table like I do, here is a SQL query
that averages qtime over 15-minute buckets, which you may or may not find
useful:


-- MIN() keeps the query valid when ONLY_FULL_GROUP_BY is enabled and
-- reports the first timestamp seen in each 15-minute bucket.
SELECT MIN(when_done) AS timestamp, AVG( qtime ), count(id)
FROM qtimes
WHERE `when_done` >= '2017-03-23 09:00:00'
  AND `when_done` <= '2017-03-23 13:00:00'
GROUP BY year(when_done), month(when_done), day(when_done),
         ( 4 * HOUR( when_done ) + FLOOR( MINUTE( when_done ) / 15 ))
ORDER BY timestamp ASC;
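
For anyone who keeps qtimes somewhere other than MySQL, the same grouping can
be sketched in Python. This is only an illustration of the bucket arithmetic
in the query (4 * hour + minute // 15 gives a quarter-hour index per day); the
function names and the (when_done, qtime) row shape are assumptions, not part
of any existing tool.

```python
from collections import defaultdict
from datetime import datetime


def bucket_key(ts):
    # Same grouping key as the SQL: calendar date plus a quarter-hour
    # index for the day (0..95), i.e. 4 * HOUR + FLOOR(MINUTE / 15).
    return (ts.year, ts.month, ts.day, 4 * ts.hour + ts.minute // 15)


def bucketed_averages(rows):
    """rows: iterable of (when_done: datetime, qtime: number) pairs.

    Returns one (first_timestamp, average_qtime, count) tuple per
    15-minute bucket, ordered by time.
    """
    buckets = defaultdict(list)
    for when_done, qtime in rows:
        buckets[bucket_key(when_done)].append((when_done, qtime))

    result = []
    for key in sorted(buckets):
        entries = buckets[key]
        first_seen = min(ts for ts, _ in entries)
        average = sum(q for _, q in entries) / len(entries)
        result.append((first_seen, average, len(entries)))
    return result
```

Feeding it rows pulled from a log file or another database should give the
same three columns as the tables below.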





pre qf/pf values:
| timestamp           | AVG( qtime ) | count(id) |
+---------------------+--------------+-----------+
| 2017-03-23 09:00:00 |     322.0585 |       581 |
| 2017-03-23 09:15:01 |     243.9634 |       628 |
| 2017-03-23 09:30:00 |     347.1856 |       652 |
| 2017-03-23 09:45:03 |     407.3195 |       673 |
| 2017-03-23 10:00:02 |     307.1313 |       678 |
| 2017-03-23 10:15:00 |     266.9802 |       759 |
| 2017-03-23 10:30:01 |     288.1789 |       833 |
| 2017-03-23 10:45:01 |     275.0880 |       852 |
| 2017-03-23 11:00:02 |     417.0151 |       861 |
| 2017-03-23 11:15:01 |     267.1153 |       945 |
| 2017-03-23 11:30:00 |     387.1656 |       803 |
| 2017-03-23 11:45:00 |     268.5137 |       837 |
| 2017-03-23 12:00:00 |     294.5911 |       807 |
| 2017-03-23 12:15:00 |     411.8617 |       752 |
| 2017-03-23 12:30:00 |     478.3566 |       788 |
| 2017-03-23 12:45:01 |     262.2294 |       680 |



after pf/qf values but too aggressive:

| timestamp           | AVG( qtime ) | count(id) |
+---------------------+--------------+-----------+
| 2017-04-03 09:00:04 |    1002.1900 |       600 |
| 2017-04-03 09:15:04 |     873.2367 |       659 |
| 2017-04-03 09:30:00 |    1013.9041 |       563 |
| 2017-04-03 09:45:01 |    1256.8596 |       591 |
| 2017-04-03 10:00:08 |    1092.8582 |       663 |
| 2017-04-03 10:15:00 |    1322.4262 |       671 |
| 2017-04-03 10:30:06 |     848.1130 |       770 |
| 2017-04-03 10:45:00 |    1039.3202 |       887 |
| 2017-04-03 11:00:00 |    1144.9216 |       536 |
| 2017-04-03 11:15:02 |     620.8999 |       719 |
| 2017-04-03 11:30:03 |     999.7113 |       665 |
| 2017-04-03 11:45:00 |    1144.1348 |       564 |
| 2017-04-03 12:00:01 |    1317.7461 |       453 |
| 2017-04-03 12:15:02 |    1413.5864 |       573 |
| 2017-04-03 12:30:02 |     746.9422 |       623 |
| 2017-04-03 12:45:00 |    1088.4789 |       568 |


and finally the modified pf/qf values, changed at exactly 10:46 am today:


+---------------------+--------------+-----------+
| timestamp           | AVG( qtime ) | count(id) |
+---------------------+--------------+-----------+
| 2017-04-04 09:00:00 |    1079.3983 |       605 |
| 2017-04-04 09:15:04 |    1190.4540 |       544 |
| 2017-04-04 09:30:00 |    1459.6425 |       621 |
| 2017-04-04 09:45:00 |    2074.2777 |       677 |
| 2017-04-04 10:00:01 |    1555.0798 |       664 |
| 2017-04-04 10:15:00 |    1313.1793 |       697 |
| 2017-04-04 10:30:00 |    1042.4969 |       809 |
| 2017-04-04 10:45:00 |     773.2043 |       695 |
| 2017-04-04 11:00:00 |     526.7830 |       788 |
| 2017-04-04 11:15:01 |     470.1969 |       711 |
| 2017-04-04 11:30:02 |     642.1838 |       136 |
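
To make the kind of change concrete, here is a rough sketch of building the
request parameters with four fields, small boosts, and no boost on the default
field. The field names, boost values, and the helper function are all made up
for illustration, and defType=edismax is an assumption (plain dismax also
accepts qf and pf); the real values are not the ones shown here.

```python
from urllib.parse import urlencode


def edismax_params(user_query, qf, pf):
    # qf and pf map field name -> boost; a boost of None means the field
    # is listed without any boost (as for the default field).
    def fmt(fields):
        return " ".join(
            name if boost is None else f"{name}^{boost}"
            for name, boost in fields.items()
        )

    return {
        "q": user_query,
        "defType": "edismax",  # assumption: extended dismax parser
        "qf": fmt(qf),
        "pf": fmt(pf),
    }


# Hypothetical fields and boosts: four fields, default field unboosted.
qf = {"text": None, "title": 3, "author": 2, "subject": 1.5}
pf = {"text": None, "title": 5, "author": 3, "subject": 2}

params = edismax_params("phrase fields performance", qf, pf)
query_string = urlencode(params)  # ready to append to a /select URL
```

Keeping the boosts in one place like this makes it easier to try a smaller
field list and compare qtimes before and after.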




On Sat, Apr 1, 2017 at 11:13 AM, Dave <hastings.recursive@gmail.com> wrote:

> Maybe commongrams could help this but it boils down to
> speed/quality/cheap. Choose two. Thanks
>
> > On Apr 1, 2017, at 10:28 AM, Shawn Heisey <apache@elyograg.org> wrote:
> >
> >> On 3/31/2017 1:55 PM, David Hastings wrote:
> >> So I un-commented out the line, to enable it to go against 6 important
> >> fields. Afterwards through monitoring performance I noticed that my
> >> searches were taking roughly 50% to 100% (2x!) longer, and it started
> >> at the exact time I committed that change, 1:40 pm, qtimes below in a
> >> 15 minute average cycle with the start time listed.
> >
> > That is fully expected.  Using both pf and qf basically has Solr doing
> > the exact same queries twice, once as specified on fields in qf, then
> > again as a phrase query on fields in pf.  If you add pf2 and/or pf3, you
> > can expect further speed drops.
> >
> > If you're sorting by relevancy, using pf with higher boosts than qf
> > generally will make your results better, but it comes at a cost in
> > performance.
> >
> > Thanks,
> > Shawn
> >
>
